Click here to Skip to main content
15,888,733 members
Home / Discussions / Algorithms
   

Algorithms

 
GeneralRe: Algorithm to find a given number is prime or not. Pin
Member 110631236-Sep-14 7:53
Member 110631236-Sep-14 7:53 
GeneralRe: Algorithm to find a given number is prime or not. Pin
Member 110631237-Sep-14 6:23
Member 110631237-Sep-14 6:23 
GeneralRe: Algorithm to find a given number is prime or not. Pin
Member 110631237-Sep-14 6:23
Member 110631237-Sep-14 6:23 
AnswerRe: Algorithm to find a given number is prime or not. Pin
Richard MacCutchan6-Sep-14 5:06
mveRichard MacCutchan6-Sep-14 5:06 
AnswerRe: Algorithm to find a given number is prime or not. Pin
PIEBALDconsult14-Nov-14 3:37
mvePIEBALDconsult14-Nov-14 3:37 
QuestionHow to describe method: SQL for frequent pattern discovery Pin
pseudogrammaton4-Sep-14 18:01
pseudogrammaton4-Sep-14 18:01 
AnswerRe: How to describe method: SQL for frequent pattern discovery Pin
Bernhard Hiller4-Sep-14 20:52
Bernhard Hiller4-Sep-14 20:52 
GeneralRe: How to describe method: SQL for frequent pattern discovery Pin
pseudogrammaton5-Sep-14 2:55
pseudogrammaton5-Sep-14 2:55 
Oh, I forgot to mention that the trigram dictionary is trimmed by abs(val1+val2+val3) <= 88 (it's a vector dataset of small int). But the 5.5m trigram dictionary might not slow things much given the use of covering indices (the access is all via b-tree indices, obviating the need for as much memory).

I looked into using a 4-gram dictionary but it presented a very large table, much larger than the 3-gram dictionary, and worse it made for more overlapping duplicates in building the equivalent of an FP-tree (at least 1 extra overlap, whereas w/ the 3-gram matches I'm always overlapping by n+2). Also I sense a trigram-based tree innately reflects the smallest useful vector of from-and-to applicable.

One problem might be that in high-frequency datasets I could see an explosion of 3-gram noise that doesn't always support better (longer) matches, bloating the output. I understand that in FP-Tree algo's there's a minimum support criterion that perhaps works around this. There may be a way to ameliorate this in SQL, such as checking for matching adjacencies in a manner that'll optimize via a correlated subquery, using ANY or [NOT] IN.

I haven't had enough time to fully experiment with various datasets, I've been going through a application language selection process** & am contemplating looking into PostGIS' geometric data & index features (R-Tree indices) as a way to get better, longer string matches, even perhaps supporting approximate or intermittent matches akin to the ability of LCS/cosine match algorithms (but on larger sets with more expressive syntax).

I'm prepared to start coding in C or Julia, but I'll avoid it if Postgres proves "fast enough." That's b/c as new data are imported to the DBMS I'll want to rerun pattern discoveries in the background against the main data store. My current 10 million rows are an exorbitant sample, out-scaling anything I expect to encounter in the actual data (MIDI note vectors).

[Edit:]
Just found this:
https://www.academia.edu/5184801/SQL_Based_Frequent_Pattern_Mining_with_FP-growth[^]

It's a very old paper (circa 2001?), but I'll probably be following their methodology. Maybe they gave up b/c DB/2 was too slow vs. algorithmic FP-Growth in C++.

Also, from a 2006 paper: http://webdocs.cs.ualberta.ca/~zaiane/postscript/adma05.pdf

"...In this work we presented COFI-Closed, an algorithm for mining frequent
closed patterns. This novel algorithm is based on existing data structures FP-tree
and COFI-tree. Our contribution is a new way to mine those existing structures
using a novel traversal approach. Using this algorithm, we mine extremely large
datasets, our performance studies showed that the COFI-Closed was able to
mine efficiently 100 million transactions in less than 3500 seconds on a small
desktop while other known approaches failed..."

I know that's old hat by now (my 2008-era Thinkpad T400 Celeron laptop w/ its 4GB RAM & SSD drive vs. his "small desktop"), but I'm in the ballpark.

**( As for fast application langs, esp. for other algorithmic jobs not best modeled in a SQL DBMS, I think I just found a winner: http://www.julialang.org[^] & http://juliawebstack.org/[^] )

modified 5-Sep-14 10:57am.

QuestionBoundless Binary Search Pin
Igor van den Hoven27-Aug-14 10:24
Igor van den Hoven27-Aug-14 10:24 
GeneralRe: Boundless Binary Search Pin
harold aptroot4-Sep-14 21:28
harold aptroot4-Sep-14 21:28 
GeneralRe: Boundless Binary Search Pin
Igor van den Hoven6-Sep-14 10:04
Igor van den Hoven6-Sep-14 10:04 
GeneralRe: Boundless Binary Search Pin
harold aptroot6-Sep-14 11:10
harold aptroot6-Sep-14 11:10 
GeneralRe: Boundless Binary Search Pin
Igor van den Hoven6-Sep-14 12:32
Igor van den Hoven6-Sep-14 12:32 
GeneralRe: Boundless Binary Search Pin
harold aptroot6-Sep-14 12:52
harold aptroot6-Sep-14 12:52 
GeneralRe: Boundless Binary Search Pin
Igor van den Hoven6-Sep-14 15:07
Igor van den Hoven6-Sep-14 15:07 
GeneralRe: Boundless Binary Search Pin
harold aptroot6-Sep-14 20:25
harold aptroot6-Sep-14 20:25 
AnswerRe: Boundless Binary Search Pin
Alan Balkany5-Nov-15 0:21
Alan Balkany5-Nov-15 0:21 
QuestionSurface calculation (sphere) - projection effect. [SOLVED ?] Pin
V.3-Aug-14 22:31
professionalV.3-Aug-14 22:31 
AnswerRe: Surface calculation (sphere) - projection effect. [SOLVED ?] Pin
Supreme Master27-Jul-15 9:24
Supreme Master27-Jul-15 9:24 
QuestionLock free algorithms Pin
Joe Woodbury1-Jul-14 8:15
professionalJoe Woodbury1-Jul-14 8:15 
AnswerRe: Lock free algorithms Pin
Michael Gazonda31-Jul-14 18:51
professionalMichael Gazonda31-Jul-14 18:51 
GeneralRe: Lock free algorithms Pin
Joe Woodbury31-Jul-14 19:21
professionalJoe Woodbury31-Jul-14 19:21 
GeneralRe: Lock free algorithms Pin
Michael Gazonda31-Jul-14 19:25
professionalMichael Gazonda31-Jul-14 19:25 
AnswerRe: Lock free algorithms Pin
SledgeHammer016-Sep-14 8:58
SledgeHammer016-Sep-14 8:58 
GeneralRe: Lock free algorithms Pin
Joe Woodbury6-Sep-14 9:50
professionalJoe Woodbury6-Sep-14 9:50 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.