OK, the issue I'm having is that the image URLs are not in an href, comment or src. This is a snippet of the code I want to get, but I also want to get it by size, so all images that are 2048 x 2048, and the same with 4096 x 4096.

HTML
<pre>#34;, "createdAt": "2018-07-30T13:33:21.373947"}, {"uid": "c442c352934545b183e16ce9aebd91cb", "width": 2048, "options": {"format": "R", "quality": 88}, "updatedAt": "2018-08-01T17:51:24.738232", "height": 2048, "size": 618478, "url": "https://media.sketchfab.com/urls/ea1adc30399045a2b101e16ba65a856f/dist/textures/a4291782af5f4ce39e637c89ec91fa9b/c442c352934545b183e16ce9aebd91cb.jpeg", "createdAt": "2018-08-01T17:51:25.334608"}, {"uid": "84275b9d01b54836893e355991288c2f", "width": 1024, "options": {"format": "R", "quality": 92}, "updatedAt": "2018-08-01T17:51:25.341010", "height": 1024, "size": 220485, "url": "https://media.sketchfab.com/urls/ea1adc30399045a2b101e16ba65a856f/dist/textures/a4291782af5f4ce39e637c89ec91fa9b/84275b9d01b54836893e355991288c2f.jpeg", "createdAt": "2018-08-01T17:51:25.451079"}, {"uid": "88897653dc004ded9faee4eaf2fa0373", "width": 512, "options": {"format": "R", "quality": 95}, "updatedAt": "2018-08-01T17:51:25.456671", "height": 512, "size": 83896, "url": "https://media.sketchfab.com/urls/ea1adc30399045a2b101e16ba65a856f/dist/textures/a4291782af5f4ce39e637c89ec91fa9b/88897653dc004ded9faee4eaf2fa0373.jpeg"



What I want to do is check the width and height, and if they match 2048 x 2048, extract that image and save it to a folder; same with 4096 x 4096.

I managed to make a regex for the long link:
(https://media.sketchfab.com)/urls/[a-z0-9]+/dist/textures/[a-z0-9]+/[a-z0-9]+.jpeg


But I'm not sure how to get it to download all images depending on size. If anyone could help it would be much appreciated; I really suck at this. Thanks in advance, elfenliedtopfan5

What I have tried:

(https://media.sketchfab.com)/urls/[a-z0-9]+/dist/textures/[a-z0-9]+/[a-z0-9]+.jpeg


C#
string urlAddress = "https://sketchfab.com/3d-models/mossberg-590-tactical-ea1adc30399045a2b101e16ba65a856f";
string urlBase = "https://sketchfab.com";

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
string data = "";
if (response.StatusCode == HttpStatusCode.OK)
{
    Stream receiveStream = response.GetResponseStream();
    StreamReader readStream = null;
    if (response.CharacterSet == null)
        readStream = new StreamReader(receiveStream);
    else
        readStream = new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet));
    data = readStream.ReadToEnd();
    response.Close();
    readStream.Close();
}
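// Dots are escaped so they match literally. The pattern has no named group, so read the whole match.
MatchCollection matches = Regex.Matches(data, @"https://media\.sketchfab\.com/urls/[a-z0-9]+/dist/textures/[a-z0-9]+/[a-z0-9]+\.jpeg");
for (int a = 0; a < matches.Count; a++)
    MessageBox.Show(matches[a].Value); // the match is already an absolute URL, so urlBase is not prepended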
Updated 19-May-19 22:22pm
Comments
DerekT-P 18-May-19 15:19pm
   
You've got JSON data; why use Regex to parse this rather than just parsing the JSON (google JSON.Net) ... ?
phil.o 19-May-19 14:12pm
   
You should post a solution out of this advice, and take credit for it.

1 solution

As DerekTP123 said, don't use regex for this. Work with the JSON object - you have width, height and URL just as you need them. It will make the code simpler to read, easier to maintain, and less error-prone.

Sure you will have to learn how to deal with JSON, but as JSON is everywhere - you can as well learn it and get the benefits right away.
The most popular library is JSON.NET - so I recommend you stick to that, as you will find most examples using this. Microsoft use it as well.

Google something like: Json.Net tutorial.

In general you have two options. The first is to write a C# class representing the JSON and have JSON.NET fill it out. That is a bit more typing initially, but then it is easy to use. See more here.
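A minimal sketch of this first option, assuming the array of image entries has already been isolated as a standalone JSON array string (the snippet in the question is only a fragment of a larger document). TextureImage and TextureFilter are made-up names for illustration, and only the three properties needed here are mapped; JSON.NET ignores the rest by default. It needs the Newtonsoft.Json NuGet package.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical class mirroring the fields needed from each image entry.
public class TextureImage
{
    [JsonProperty("width")]  public int Width { get; set; }
    [JsonProperty("height")] public int Height { get; set; }
    [JsonProperty("url")]    public string Url { get; set; }
}

public static class TextureFilter
{
    // Assumes json is a JSON array of image entries like the ones in the question.
    public static List<string> UrlsOfSize(string json, int size)
    {
        List<TextureImage> images = JsonConvert.DeserializeObject<List<TextureImage>>(json);
        return images
            .Where(i => i.Width == size && i.Height == size)
            .Select(i => i.Url)
            .ToList();
    }
}
```

Calling TextureFilter.UrlsOfSize(json, 2048) and then again with 4096 would give the two URL lists the question asks for.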

Alternatively you can read the object as a JObject (JSON.NET's generic JSON object type). Then you have to index all the properties yourself. You won't need any C# classes representing the JSON, but you will have to make sure you type the right property names when using it, since the compiler won't help you. Some examples here.
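A minimal sketch of this second option, under the same assumption that the image entries are available as a standalone JSON array string. TextureScan is a made-up name, and the Newtonsoft.Json NuGet package is required.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

public static class TextureScan
{
    // Assumes json is a JSON array of objects like the entries in the question.
    public static List<string> UrlsOfSize(string json, int size)
    {
        var urls = new List<string>();
        foreach (JToken item in JArray.Parse(json))
        {
            // Property names are plain strings here; a typo yields null, not a compile error.
            int? width = (int?)item["width"];
            int? height = (int?)item["height"];
            if (width == size && height == size)
                urls.Add((string)item["url"]);
        }
        return urls;
    }
}
```

Reading the values as int? means an entry without a width or height is simply skipped instead of throwing.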

If - as you mention in the comment - it isn't easy to access the JSON for one reason or another, it is possible to write a regex, just be prepared to fiddle with it regularly to keep it working.

I have created an example that seems to work:
"width"\s*:\s*(?'width'\d+).+?"height"\s*:\s*(?'height'\d+).+?"url"\s*:\s*\"(?'url'.+?)"

It relies on the order of the properties in the json object - something you really shouldn't do as the property order isn't significant in json - meaning the code writing it could change it around for no apparent reason - but it is not easy to write a simple regex that can handle reordering.

The regex is pretty straightforward as regexes go. First it looks for "width", then a colon (with optional whitespace around it), followed by a named capture group 'width' taking the next digits. The ?'width' just inside the parentheses defines the name. It is not necessary, but it makes it easier to extract the values later in a robust way - as future changes could add additional groups (a group being anything in parentheses).

It then skips until "height", which is captured the same way. Notice the skipping over other JSON properties is done using .+?. The trailing ? tells the regex to "stop" at the first possible opportunity (so the first time it can match "height"). Without this, the regex is "greedy" - so it will match as much as it can. This would make it read the first width in your text, then skip all the way to the last height - and you would only get a single match.
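The difference between the two kinds of skipping can be checked with a small sketch. It uses a shortened width/height pattern rather than the full regex, and LazyVsGreedy is a made-up name.

```csharp
using System.Text.RegularExpressions;

public static class LazyVsGreedy
{
    // Counts width...height pairs using either a lazy (.+?) or a greedy (.+) skip.
    public static int CountMatches(string text, bool lazy)
    {
        string skip = lazy ? ".+?" : ".+";
        string pattern = "\"width\"\\s*:\\s*(\\d+)" + skip + "\"height\"\\s*:\\s*(\\d+)";
        return Regex.Matches(text, pattern).Count;
    }
}
```

On a single line of text holding two entries, the lazy version finds two matches, while the greedy version finds one that runs from the first width to the last height.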

Finally it skips to "url" and creates a new group capturing the text inside the following quotes - again using a non-greedy match to make sure it stops at the first quote instead of eating your entire text in one go.
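A minimal sketch of using this regex from C#, assuming the text really contains plain double quotes and that each entry sits on one line (. does not match a newline by default). TextureRegex is a made-up name. The pattern is written as a verbatim string, so every " is doubled and the backslashes stay as they are.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TextureRegex
{
    // The regex above as a verbatim string: " becomes "", backslashes are left alone.
    private static readonly Regex Entry = new Regex(
        @"""width""\s*:\s*(?'width'\d+).+?""height""\s*:\s*(?'height'\d+).+?""url""\s*:\s*\""(?'url'.+?)""");

    public static List<string> UrlsOfSize(string text, int size)
    {
        var urls = new List<string>();
        foreach (Match m in Entry.Matches(text))
        {
            // Read the values by group name rather than by index.
            int width = int.Parse(m.Groups["width"].Value);
            int height = int.Parse(m.Groups["height"].Value);
            if (width == size && height == size)
                urls.Add(m.Groups["url"].Value);
        }
        return urls;
    }
}
```

Filtering on the parsed numbers keeps the regex itself independent of the sizes, so the same method serves both 2048 and 4096.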

You can play with making it more restrictive to avoid false positives of course. I recommend you use an online validator to quickly see the result of your changes, maybe something like regex101.com.

Once you have the URL it is easy to download the image (as long as the site does not try to stop you).

I recommend you look at WebClient.

Specifically the methods OpenRead, DownloadData, and DownloadFile. You can use any of them, but depending on what you want to do with the image, one will most likely offer a more convenient output than the other two. You can also replace your HttpWebRequest/HttpWebResponse with a WebClient. It will do all work reading the response stream for you (basically it just wraps the HttpWebRequest/Response and will do all the boring stuff for you).
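A minimal sketch of the DownloadFile route, assuming the URL has already been extracted and the site allows the download. TextureDownloader, the folder argument and the size prefix in the file name are all arbitrary choices for illustration.

```csharp
using System;
using System.IO;
using System.Net;

public static class TextureDownloader
{
    // Builds a local path from the last segment of the URL, prefixed with the size.
    public static string LocalPathFor(string url, int size, string folder)
    {
        string fileName = Path.GetFileName(new Uri(url).LocalPath);
        return Path.Combine(folder, size + "x" + size + "_" + fileName);
    }

    public static void Download(string url, int size, string folder)
    {
        Directory.CreateDirectory(folder); // does nothing if the folder already exists
        using (var client = new WebClient())
        {
            client.DownloadFile(url, LocalPathFor(url, size, folder));
        }
    }
}
```

DownloadFile fits best here because the goal is a file on disk; OpenRead or DownloadData would make more sense if the image had to be inspected in memory first.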
   
Comments
elfenliedtopfan5 20-May-19 11:16am
   
Yeah, I looked into JSON yesterday, but the issue is that there are about 30 occurrences of this in the HTML page, and with JSON you have to put id : "sometexthere". I don't know what some of the names are because they're hashed, so I'm not sure how to correctly accomplish this.
lmoelleb 20-May-19 14:00pm
   
I added an example regex - the included link to regex101.com will take you to a site where it is already filled in allowing you to play around with it easily.
elfenliedtopfan5 20-May-19 18:23pm
   
Umm, that's great. The only issue I have is that it gives me loads of errors due to the formatting of the regex, I think, like a \ ? error.
lmoelleb 20-May-19 23:58pm
   
I click the link to regex101 and see no errors. Where do you see errors, and what are the errors? Please never report something like "load of errors", always include at least one precise example of what the error text is and where you see it. The regex contains " and \, which must be escaped when you copy it as a string into your C# source - is that the problem? I can't say for sure without precise information.
elfenliedtopfan5 21-May-19 10:43am
   
Sorry for the lack of info, it was really early in the morning last night. So basically yes, when I copied it across to C# as a pattern I got errors on \s ? but I assume that's because it's quoted, and \ I think means something in C# as well.

But with a hell of a lot of editing I managed to come up with this:

@"""width""\s*:\s*(?'width'\d+).+?""height""\s*:\s*(?'height'\d+).+?""url""\s*:\s*\""(?'url'.+?)"""

But the regex still just skips the found-match part:
https://i.imgur.com/SDJsYff.gifv
lmoelleb 21-May-19 14:05pm
   
So search replace " -> "" is a hell of a lot of editing - I have some bad news about how simple that editing is in the context of programming. :)

Did you look carefully at the source text? Copy and paste it from the debugger into regex101 and check that it matches. When I look at the source from the website, I find HTML escaped text. So it is not "width" but &#34;width&#34;. (It seems Code Project automatically adjusts the text we write, hence the example in your post no longer matches the source.) So either HTML decode the source first, or adjust the regex to match &#34; instead of ". Regex101 can help you with this - just copy the source text over, and start building up the regex based on my example. Do NOT copy over the entire regex, type from the start and see the match changing as you go. If you copy it all at once, nothing will match and you can't see when you make the regex either better or worse.
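A minimal sketch of the decode-first option, assuming the downloaded page source has its double quotes entity-encoded (for example as &#34;). SourceCleaner is a made-up name; WebUtility.HtmlDecode is part of System.Net.

```csharp
using System.Net;

public static class SourceCleaner
{
    // Turns entity-encoded text such as &#34;width&#34;: 2048 back into "width": 2048.
    public static string Decode(string pageSource)
    {
        return WebUtility.HtmlDecode(pageSource);
    }
}
```

Running the page source through this before applying the regex means the pattern can keep matching plain " characters.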
elfenliedtopfan5 21-May-19 15:10pm
   
I have downloaded the source of the webpage, I just can't seem to find the right formula. And what do you mean, my code is automatically adjusting the text?
lmoelleb 21-May-19 15:20pm
   
Copy the source text of the web page from the debugger and paste it into the Test String field on regex101.com. Now start typing the regex. Observe how regex101.com will show you what is matched. Hover the mouse over the colored parts of the regex, and it will give you a sentence telling you what it matches. If you just keep trying stuff in C# you will never get it right. If you are not willing to use regex101.com or a similar tool, then regex is not for you, and you should write the code with IndexOf and Substring etc. Once you have it working on regex101, move it to C# (making sure to escape it as you did earlier). If you still can't figure it out, google "regex tutorial" - there are tons of them. If all you want is someone to supply you the regex, then you are better off avoiding regexes in that case. They are useful, but you need to invest time in learning how to use them.
elfenliedtopfan5 21-May-19 19:26pm
   
OK, I managed to get it to work and download, but the only issue I have now is: is there a way to do it block by block? As it's all one piece, when it names the textures they don't correspond to the right name. So is there a way to get the name with the height and then the URL, and then redo it each time?
lmoelleb 22-May-19 1:24am
   
Great. For doing it block by block, this is beyond what you would do with a single regex. Loop over the blocks and then use your regex with the text of each block. You can use another regex to identify the blocks - or often string.IndexOf and string.Substring will do.
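A minimal sketch of the block-by-block idea, assuming each entry is a {...} object with at most one level of nested braces (the "options" object in the snippet) and that no string value contains a brace. TextureEntry and BlockParser are made-up names, and uid stands in for the texture name because it is the only identifier visible in the posted snippet.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical holder for the values pulled out of one block.
public class TextureEntry
{
    public string Uid;
    public int Width;
    public int Height;
    public string Url;
}

public static class BlockParser
{
    // One {...} block, allowing a single level of nested braces.
    private static readonly Regex Block = new Regex(@"\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}");
    private static readonly Regex Uid = new Regex(@"""uid""\s*:\s*""(?'v'[^""]+)""");
    private static readonly Regex Width = new Regex(@"""width""\s*:\s*(?'v'\d+)");
    private static readonly Regex Height = new Regex(@"""height""\s*:\s*(?'v'\d+)");
    private static readonly Regex Url = new Regex(@"""url""\s*:\s*""(?'v'[^""]+)""");

    public static List<TextureEntry> Parse(string text)
    {
        var entries = new List<TextureEntry>();
        foreach (Match block in Block.Matches(text))
        {
            // Each small regex only sees the text of the current block.
            Match uid = Uid.Match(block.Value), w = Width.Match(block.Value),
                  h = Height.Match(block.Value), url = Url.Match(block.Value);
            if (!(uid.Success && w.Success && h.Success && url.Success))
                continue; // not an image entry
            entries.Add(new TextureEntry
            {
                Uid = uid.Groups["v"].Value,
                Width = int.Parse(w.Groups["v"].Value),
                Height = int.Parse(h.Groups["v"].Value),
                Url = url.Groups["v"].Value
            });
        }
        return entries;
    }
}
```

Because every value is looked up inside its own block, the uid, size and URL always belong to the same entry, and the order of the properties within a block no longer matters.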

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



