Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Question
Thursday, September 24, 2009 3:58 AM
Hello everyone,
I was wandering how can one find a repeated pattern in a string. for example:
asdxgkeopgkajdflkjbpoijadadafhjkafikeoadkjhadfkjhocihakeo
If you see the keo is repeated. BUT that I don't know it in advance. I want a function that can identify a repeating pattern and save it saying me how many times it found it.
Thanks in advance for any help
idipous
All replies (12)
Thursday, September 24, 2009 5:17 AM ✅Answered
Hi,
I just tried this one. I am not tested well. May be it will help you.
string value = textBox1.Text.Trim();
int length=3;
ArrayList list = new ArrayList();
for (int i = 0; i < value.Length; i++)
{
if (i + length < value.Length)
{
string subValue = value.Substring(i, length);
list.Add(subValue);
i = i + length;
}
}
foreach (string lst in list)
{
foreach (string lstIn in list)
{
if (Regex.IsMatch(lstIn, lst, RegexOptions.IgnoreCase))
{
MessageBox.Show("Pattern= "+lst+" "+"match= "+lstIn);
}
}
}
Best Regards, C.Gnanadurai Please mark the post as answer if it is helpfull to you
Thursday, September 24, 2009 5:39 AM ✅Answered | 2 votes
Probably the best answer for this question is to use algorithms for pattern discovery.
But here is a code I wrote for your example.
private static Dictionary<string, int> FindPatterns(string value)
{
List<string> patternToSearchList = new List<string>();
for (int i = 0; i < value.Length; i++)
{
for (int j = 2; j <= value.Length / 2; j++)
{
if (i + j <= value.Length)
{
patternToSearchList.Add(value.Substring(i, j));
}
}
}
Dictionary<string, int> results = new Dictionary<string, int>();
foreach (string pattern in patternToSearchList)
{
int occurence = Regex.Matches(value, pattern, RegexOptions.IgnoreCase).Count;
if (occurence > 1)
{
results[pattern] = occurence;
}
}
return results;
}
static void Main(string[] args)
{
Dictionary<string, int> result = FindPatterns("asdxgkeopgkajdflkjbpoijadadafhjkafikeoadkjhadfkjhocihakeo");
foreach (KeyValuePair<string, int> res in result.OrderByDescending(r => r.Value))
{
Console.WriteLine("Pattern:" + res.Key + " occurence:" + res.Value.ToString());
}
Console.Read();
Note: This is not a very efficient way, using an algorith is the best solution.
Thursday, September 24, 2009 5:53 AM ✅Answered
For BoyerMore algorithm you can visit
http://www.codeproject.com/KB/recipes/BoyerMooreSearch.aspx
Thursday, September 24, 2009 6:44 AM ✅Answered | 1 vote
Hello, here is another possibility.
Code Snippet
- using System;
- using System.Collections.Generic;
- using System.Text.RegularExpressions;
- class Program
- {
- static void Main(string[] args) {
- string str = "asdxgkeopgkajdflkjbpoijadadafhjkafikeoadkjhadfkjhocihakeo";
- foreach (KeyValuePair<string, int> pair in GetPatterns(str, 3))
- {
- if (pair.Value > 1)
- {
- Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
- }
- }
- Console.ReadKey();
- }
- private static IEnumerable<KeyValuePair<string, int>> GetPatterns(string value, int blockLength) {
- string currentBlock = string.Empty;
- IList<string> list = new List<string>();
- for (int i = 0; i < value.Length; i++)
- {
- if (i + blockLength <= value.Length)
- {
- currentBlock = value.Substring(i, blockLength);
- if (!list.Contains(currentBlock))
- {
- list.Add(currentBlock);
- MatchCollection match = Regex.Matches(value, currentBlock);
- yield return new KeyValuePair<string, int>(currentBlock, match.Count);
- }
- }
- }
- }
- }
Eyal, Regards.
Thursday, September 24, 2009 8:55 AM ✅Answered
This can also be done fairly efficiently by generating a suffix tree for the string.
See http://en.wikipedia.org/wiki/Suffix_tree
I don't have C# code for that though.
Thursday, September 24, 2009 9:27 AM ✅Answered
This can also be done fairly efficiently by generating a suffix tree for the string.
See http://en.wikipedia.org/wiki/Suffix_tree
I don't have C# code for that though.
Just doing a google search on the behalf of Matthew Watson. :)
http://code.google.com/p/csharsuffixtree/source/browse/#svn/trunk/suffixtree
Nice thing too, also.
http://en.wikipedia.org/wiki/Aho-Corasick_algorithmEyal, Regards.
Thursday, September 24, 2009 4:04 AM
What are your pattern? Are they prefined? If you don't know what are your patterns look like, it is impossible.
Thursday, September 24, 2009 5:41 AM
This is the basis for the PKZip deflate algorithm. Look at the reference source for DeflateStream.
Saturday, September 26, 2009 12:14 AM
Thank you guys for all the help. I will give it a try with the stuff here and let you know what worked better.
Saturday, September 26, 2009 12:55 PM
Hello everyone,
I was wandering how can one find a repeated pattern in a string. for example:
asdxgkeopgkajdflkjbpoijadadafhjkafikeoadkjhadfkjhocihakeo
If you see the keo is repeated. BUT that I don't know it in advance. I want a function that can identify a repeating pattern and save it saying me how many times it found it.
Thanks in advance for any help
idipous
You have more patterns besides keo, ad, ke, eo, kj, gk, ka, df, da, af, kjh, jh, ha
John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
Saturday, September 26, 2009 1:11 PM
Tamer's example caught every pattern.John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
Thursday, October 1, 2009 1:38 AM
Hi,
Does it work for you ?
HarryPlease remember to mark the replies as answers if they help and unmark them if they provide no help.
Welcome to the All-In-One Code Framework! If you have any feedback, please tell us.