Hey guys!

I'm not sure the title is the best, but I'll describe what I need to do and hope that someone can help me find a way to accomplish it.

I work in VB.NET and have created an app that builds wordlists from text. However, I work with a foreign language and would like to add a feature that helps me and colleagues who work with other languages.

The problem: below is a short list of words in the language I work with, divided manually into prefix/infix/root-radical/suffix. I have been searching for a very long time for a way to:

1. Search through a wordlist (tens, hundreds or thousands of words) in a similar way to a longest-prefix search.
2. Strip off every prefix found in the list until only the root/radical is left.
3. Group words with similar prefixes, splitting prefixes/infixes/root-radical/suffixes into separate columns of a ListView, much like the following list.

List of Words:

a-ti-a-nug
a-ti-a-pun
a-ti-atu-mohey
a-ti-atu-mõ'e
a-ti-kuap
a-ti-wyro
a-ti-po-wyro
a-tu-uka
a-tu-nug
a-to-pi'ig
a-to-py-tyk
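
For words that are already hyphenated like the list above, the column layout and grouping could be sketched roughly as follows. This is only a minimal illustration in Python (the same logic should port to VB.NET). It assumes the last hyphen-separated piece is the root and everything before it is a prefix or infix, which may not hold for every word; the names `to_row`, `build_rows` and `group_by_leading` are made up for the example.

```python
from collections import defaultdict

# The manually segmented words from the list above.
WORDS = [
    "a-ti-a-nug", "a-ti-a-pun", "a-ti-atu-mohey", "a-ti-atu-mõ'e",
    "a-ti-kuap", "a-ti-wyro", "a-ti-po-wyro", "a-tu-uka", "a-tu-nug",
    "a-to-pi'ig", "a-to-py-tyk",
]

def to_row(word, width):
    """Split on hyphens; ASSUMPTION: the last piece is the root.
    Empty cells pad the prefix slots so every row has `width` columns."""
    *prefixes, root = word.split("-")
    padding = [""] * (width - 1 - len(prefixes))
    return prefixes + padding + [root]

def build_rows(words):
    """One row per word, all rows as wide as the most segmented word."""
    width = max(len(w.split("-")) for w in words)
    return [to_row(w, width) for w in words]

def group_by_leading(words, depth=2):
    """Group words that share their first `depth` morphemes."""
    groups = defaultdict(list)
    for w in words:
        key = "-".join(w.split("-")[:depth])
        groups[key].append(w)
    return dict(groups)

rows = build_rows(WORDS)
groups = group_by_leading(WORDS, depth=2)
```

Each entry in `rows` is a list of cells that could fill one ListView row, and `groups` maps a shared leading sequence such as `a-ti` to every word that starts with it.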


I'm sure a number of different methods could achieve this (regex, etc.), but after much searching I am drawing a blank. If anyone can help with this problem I would greatly appreciate it!

What I have tried:

I have tried the longest-prefix approach, but it scans the list until the longest prefix is found and then returns just that prefix, i.e. only one word. I need to produce something like the list above, with some flexibility (e.g. a numeric up/down spinner that sets how many characters to split into each column), because it needs to work for more than one language.
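
The "spinner" idea can be illustrated with a small sketch: instead of returning one longest prefix, bucket every word under its first `n` characters and keep each bucket that holds more than one word. This is a Python illustration only, not a finished solution; `group_by_prefix_length` is a hypothetical helper, and the sample words are the list entries with the hyphens removed to stand in for unsegmented input.

```python
from collections import defaultdict

def group_by_prefix_length(words, n):
    """Bucket words by their first n characters (n plays the role of the
    spinner value) and keep only prefixes shared by at least two words."""
    groups = defaultdict(list)
    for word in words:
        groups[word[:n]].append(word)
    return {prefix: ws for prefix, ws in groups.items() if len(ws) > 1}

# Unsegmented stand-ins for a few of the listed words.
raw_words = ["atianug", "atiapun", "atikuap", "atuuka", "atunug", "atopi'ig"]
shared = group_by_prefix_length(raw_words, 3)
```

With `n = 3` this returns every word under `ati` and `atu`, and drops `ato` because only one sample word starts with it. Changing `n` changes where the first column is cut, which is the per-language flexibility asked for, but it does not by itself know where a morpheme really ends.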
Updated 12-Jul-21 10:47am
Comments
[no name] 12-Jul-21 22:07pm    
Languages, alphabets and phonetics each rely on their own rules and tables; there is no magic solution. They also go through versions, so the results are date- and author-sensitive as well.
Member 13032047 13-Jul-21 14:11pm    
Thanks for your input, Gerry! I'm not looking for a magic solution. Searching for the longest prefix is a programmatic way of automating a previously manual task. If you only have to repeat a task once or twice, there is no reason to automate it. I am looking for a solution that basically does what the longest-prefix or longest-suffix operations do, but that groups and lists all words with the same prefix or suffix (rather than just one word) and also separates the morphemes into columns. That seems like a perfectly viable application to me.
Peter_in_2780 13-Jul-21 3:49am    
This sounds like a database application. If you have already manually split the words, it should be straightforward to import them into a database where each record contains the parts of the word as fields. Then you can use any of the query tools available in your database of choice.
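
As a rough illustration of the database idea, here is a sketch using Python's built-in `sqlite3` module with an in-memory database. The table name `morphemes` and its column names are invented for the example, and only a few of the listed words are loaded, with the last segment again assumed to be the root.

```python
import sqlite3

# In-memory database; one record per word, one field per morpheme slot.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE morphemes "
    "(word TEXT, prefix1 TEXT, prefix2 TEXT, prefix3 TEXT, root TEXT)"
)
sample = [
    ("a-ti-a-nug", "a", "ti", "a", "nug"),
    ("a-ti-kuap", "a", "ti", "", "kuap"),
    ("a-tu-uka", "a", "tu", "", "uka"),
    ("a-tu-nug", "a", "tu", "", "nug"),
]
conn.executemany("INSERT INTO morphemes VALUES (?, ?, ?, ?, ?)", sample)

def words_with_root(root):
    """All words built on the given root, in alphabetical order."""
    cur = conn.execute(
        "SELECT word FROM morphemes WHERE root = ? ORDER BY word", (root,)
    )
    return [row[0] for row in cur]

def count_by_second_prefix():
    """How many words use each value in the second prefix slot."""
    cur = conn.execute(
        "SELECT prefix2, COUNT(*) FROM morphemes "
        "GROUP BY prefix2 ORDER BY prefix2"
    )
    return cur.fetchall()
```

Once the parts are stored as fields, grouping by any column is a one-line query, which is the convenience this suggestion points at. It still relies on the words having been split beforehand.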
Member 13032047 13-Jul-21 14:17pm    
Thanks for your input, Peter! I actually did the manual splitting in Notepad. The final app I hope to create needs to automate the process, so both the splitting and any subsequent steps need some method of automation. I'm not familiar with using a database. Because the app works with "unknown data which contains recognisable patterns (i.e. Roman alphabet or installed fonts)", I'm thinking along the lines of pattern recognition, i.e. comparing all words in a list to find patterns and reporting the results as columns in a ListView. Any ideas?
[no name] 14-Jul-21 10:53am    
Sounds like "rules and tables": programming.
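
One way to read "rules and tables" is a per-language prefix table that is stripped off repeatedly until only the root is left. The sketch below is a Python illustration under loud assumptions: the `PREFIXES` set is simply collected by hand from the hyphenated sample list (it is not a real linguistic table), `segment` is a hypothetical helper, and the greedy longest-match rule is one possible rule among many.

```python
# ASSUMPTION: a hand-made prefix table taken from the sample word list.
PREFIXES = {"a", "ti", "tu", "to", "atu", "po", "py"}

def segment(word, prefixes, min_root=2):
    """Repeatedly strip the longest matching prefix, always leaving at
    least `min_root` characters, and return [prefix, ..., root]."""
    ordered = sorted(prefixes, key=len, reverse=True)
    parts = []
    rest = word
    while True:
        for p in ordered:
            if rest.startswith(p) and len(rest) - len(p) >= min_root:
                parts.append(p)
                rest = rest[len(p):]
                break
        else:
            # No prefix matched: whatever is left is treated as the root.
            return parts + [rest]

columns = segment("atiatumohey", PREFIXES)
```

For `atianug`, `atiatumohey` and `atopytyk` this reproduces the manual splits from the question. It also shows the limit of a purely mechanical rule: `atuuka` comes out as `atu` + `uka` rather than the manual `a-tu-uka`, because the longest match wins, so ambiguous cases still need language-specific rules or a human check.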

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


