The issue I'm having is that the image URLs are not in an href, a comment, or a src attribute. This is a snippet of the data I want to extract from, but I also want to select images by size, for example all images that are 2048 x 2048, and likewise 4096 x 4096.

HTML
", "createdAt": "2018-07-30T13:33:21.373947"}, {"uid": "c442c352934545b183e16ce9aebd91cb", "width": 2048, "options": {"format": "R", "quality": 88}, "updatedAt": "2018-08-01T17:51:24.738232", "height": 2048, "size": 618478, "url": "https://media.sketchfab.com/urls/ea1adc30399045a2b101e16ba65a856f/dist/textures/a4291782af5f4ce39e637c89ec91fa9b/c442c352934545b183e16ce9aebd91cb.jpeg", "createdAt": "2018-08-01T17:51:25.334608"}, {"uid": "84275b9d01b54836893e355991288c2f", "width": 1024, "options": {"format": "R", "quality": 92}, "updatedAt": "2018-08-01T17:51:25.341010", "height": 1024, "size": 220485, "url": "https://media.sketchfab.com/urls/ea1adc30399045a2b101e16ba65a856f/dist/textures/a4291782af5f4ce39e637c89ec91fa9b/84275b9d01b54836893e355991288c2f.jpeg", "createdAt": "2018-08-01T17:51:25.451079"}, {"uid": "88897653dc004ded9faee4eaf2fa0373", "width": 512, "options": {"format": "R", "quality": 95}, "updatedAt": "2018-08-01T17:51:25.456671", "height": 512, "size": 83896, "url": "https://media.sketchfab.com/urls/ea1adc30399045a2b101e16ba65a856f/dist/textures/a4291782af5f4ce39e637c89ec91fa9b/88897653dc004ded9faee4eaf2fa0373.jpeg"



What I want to do is check the width and height, and if they match 2048 x 2048, extract that image and save it to a folder. The same goes for 4096 x 4096.

I managed to write a regex for the long link:
(https://media.sketchfab.com)/urls/[a-z0-9]+/dist/textures/[a-z0-9]+/[a-z0-9]+.jpeg


But I'm not sure how to get it to download all images depending on size. If anyone could help, it would be much appreciated; I'm really struggling with this. Thanks in advance, elfenliedtopfan5

What I have tried:

(https://media.sketchfab.com)/urls/[a-z0-9]+/dist/textures/[a-z0-9]+/[a-z0-9]+.jpeg


C#
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using System.Windows.Forms;

string urlAddress = "https://sketchfab.com/3d-models/mossberg-590-tactical-ea1adc30399045a2b101e16ba65a856f";

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
string data = "";
// Dispose the response and the reader even if reading throws.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    if (response.StatusCode == HttpStatusCode.OK)
    {
        // Fall back to UTF-8 when the server does not report a character set.
        Encoding encoding = string.IsNullOrEmpty(response.CharacterSet)
            ? Encoding.UTF8
            : Encoding.GetEncoding(response.CharacterSet);
        using (StreamReader readStream = new StreamReader(response.GetResponseStream(), encoding))
            data = readStream.ReadToEnd();
    }
}
// Dots are escaped so that they only match a literal '.'.
MatchCollection matches = Regex.Matches(data, @"https://media\.sketchfab\.com/urls/[a-z0-9]+/dist/textures/[a-z0-9]+/[a-z0-9]+\.jpeg");
// Each match is already an absolute URL, so the whole match value is shown.
for (int a = 0; a < matches.Count; a++)
    MessageBox.Show(matches[a].Value);
Updated 19-May-19 21:22pm
Comments
DerekT-P 18-May-19 15:19pm    
You've got JSON data; why use Regex to parse this rather than just parsing the JSON (google JSON.Net) ... ?
phil.o 19-May-19 14:12pm    
You should post a solution out of this advice, and take credit for it.

1 solution

As DerekTP123 said, don't use regex for this. Work with the JSON object - you have width, height and URL just as you need them. It will make the code simpler to read, easier to maintain, and less error-prone.

Sure, you will have to learn how to deal with JSON, but as JSON is everywhere, you may as well learn it and get the benefits right away.
The most popular library is JSON.NET, so I recommend you stick to that, as most examples you find will use it. Microsoft uses it as well.

Google something like: Json.Net tutorial.

In general you have two options. The first is to write a C# class representing the JSON and have JSON.NET fill it in. A bit more typing initially, but then it is easy to use. See more here[^].
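A minimal sketch of the typed approach, assuming the image entries have first been isolated as one complete JSON array (the snippet in the question is only a fragment, so getting to that array is left out here). The class name TextureImage and the helper UrlsOfSize are hypothetical; JsonConvert.DeserializeObject and the JsonProperty attribute come from the Json.NET (Newtonsoft.Json) NuGet package.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json; // Json.NET NuGet package

// Hypothetical class mirroring the fields visible in the posted snippet.
// JSON properties without a matching member (options, size, ...) are ignored.
public class TextureImage
{
    [JsonProperty("uid")] public string Uid { get; set; }
    [JsonProperty("width")] public int Width { get; set; }
    [JsonProperty("height")] public int Height { get; set; }
    [JsonProperty("url")] public string Url { get; set; }
}

public static class TypedJsonExample
{
    // Assumption: imagesJson is a complete JSON array such as [ {...}, {...} ].
    public static List<string> UrlsOfSize(string imagesJson, int size)
    {
        List<TextureImage> images =
            JsonConvert.DeserializeObject<List<TextureImage>>(imagesJson);

        // Keep only square images of the requested size.
        return images
            .Where(i => i.Width == size && i.Height == size)
            .Select(i => i.Url)
            .ToList();
    }
}
```

Calling TypedJsonExample.UrlsOfSize(json, 2048) and then TypedJsonExample.UrlsOfSize(json, 4096) would give the two URL lists asked for in the question.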

Alternatively you can read the object as a JObject. Then you have to index all the properties yourself. You won't need any C# classes representing the JSON - but you will have to make sure you type the right property names when using it - the compiler won't help you. Some examples here[^].
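A rough sketch of the untyped approach, under the same assumption that the image entries are available as one complete JSON array. JArray.Parse and the string indexer on JToken are part of Json.NET's Newtonsoft.Json.Linq namespace; the class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json.Linq; // Json.NET NuGet package

public static class JObjectExample
{
    // Assumption: imagesJson is a complete JSON array such as [ {...}, {...} ].
    public static List<string> UrlsOfSize(string imagesJson, int size)
    {
        var urls = new List<string>();
        foreach (JToken image in JArray.Parse(imagesJson))
        {
            // Property names are plain strings here, so a typo is only
            // discovered at run time (the value simply comes back as null).
            int? width = (int?)image["width"];
            int? height = (int?)image["height"];
            string url = (string)image["url"];

            if (width == size && height == size && url != null)
                urls.Add(url);
        }
        return urls;
    }
}
```

Only the fixed property names ("width", "height", "url") are typed in the code; the hashed values such as the uid never need to be known in advance.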

If - as you mention in the comment - it isn't easy to access the JSON for one or another reason, it is possible to write a regex, just be prepared to fiddle with it regularly to keep it working.

I have created an example that seems to work:
"width"\s*:\s*(?'width'\d+).+?"height"\s*:\s*(?'height'\d+).+?"url"\s*:\s*\"(?'url'.+?)"

It relies on the order of the properties in the json object - something you really shouldn't do as the property order isn't significant in json - meaning the code writing it could change it around for no apparent reason - but it is not easy to write a simple regex that can handle reordering.

The regex is pretty straightforward as regexes go. First it looks for "width", then a colon (with optional whitespace around it), followed by a named capture group 'width' taking the next digits. The ?'width' just inside the parentheses defines the name. Naming the group is not necessary, but it makes it easier to extract the values later in a robust way - as future changes could add additional groups (a group being anything in parentheses).

It then skips until "height", which is captured the same way. Notice that the skipping over other JSON properties is done using .+?. The trailing ? tells the regex to "stop" at the first possible opportunity (so the first time it can match "height"). Without this, the regex is "greedy" - so it will match as much as it can. This would make it read the first width in your text, then skip all the way to the last height - and you would only get a single match.

Finally it skips to "url" and creates a new group capturing the text inside the following quotes - again using a non-greedy match to make sure it stops at the first quote instead of eating your entire text in one go.
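As a sketch of how the named groups could be used from C#, the pattern is written below as a verbatim string, where "" stands for a single literal quote. The class and method names are hypothetical, and the code assumes the properties appear in the order width, height, url, as in the posted snippet.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexExample
{
    // Same pattern as above; inside @"..." every literal quote is doubled.
    private static readonly Regex ImagePattern = new Regex(
        @"""width""\s*:\s*(?'width'\d+).+?""height""\s*:\s*(?'height'\d+).+?""url""\s*:\s*\""(?'url'.+?)""");

    public static List<string> UrlsOfSize(string data, int size)
    {
        var urls = new List<string>();
        foreach (Match m in ImagePattern.Matches(data))
        {
            // Read the values by group name rather than by position.
            int width = int.Parse(m.Groups["width"].Value);
            int height = int.Parse(m.Groups["height"].Value);

            if (width == size && height == size)
                urls.Add(m.Groups["url"].Value);
        }
        return urls;
    }
}
```

Note that . does not match a line break by default, so this sketch only works as-is when each image entry sits on a single line, which is the case in the posted data.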

You can play with making it more restrictive to avoid false positives of course. I recommend you use an online validator to quickly see the result of your changes, maybe something like regex101.com[^]

Once you have the URL it is easy to download the image (as long as the site does not try to stop you).

I recommend you look at WebClient[^]

Specifically, look at the methods OpenRead, DownloadData, and DownloadFile. You can use any of them, but depending on what you want to do with the image, one will most likely offer a more convenient output than the other two. You can also replace your HttpWebRequest/HttpWebResponse with a WebClient. It does all the work of reading the response stream for you (basically it just wraps the HttpWebRequest/Response and handles the boring stuff).
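A small sketch of the download step using WebClient.DownloadFile, assuming the list of URLs has already been filtered by size. The helper names and the folder layout are made up for illustration; the file name is simply taken from the last segment of each URL. Newer .NET versions mark WebClient as obsolete in favour of HttpClient, but it still works.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

public static class DownloadExample
{
    // Builds the local path from the last segment of the URL,
    // e.g. ".../c442c352934545b183e16ce9aebd91cb.jpeg" -> "<folder>/c442c352934545b183e16ce9aebd91cb.jpeg".
    public static string TargetPath(string folder, string url)
    {
        string fileName = Path.GetFileName(new Uri(url).LocalPath);
        return Path.Combine(folder, fileName);
    }

    public static void DownloadAll(IEnumerable<string> urls, string folder)
    {
        Directory.CreateDirectory(folder); // does nothing if the folder already exists
        using (var client = new WebClient())
        {
            foreach (string url in urls)
                client.DownloadFile(url, TargetPath(folder, url));
        }
    }
}
```

One possible layout is a folder per size, for example DownloadExample.DownloadAll(urls2048, "2048") followed by a second call for the 4096 images.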
 
Comments
elfenliedtopfan5 20-May-19 11:16am    
Yeah, I looked into JSON yesterday, but the issue is that there are about 30 occurrences of this in the HTML page, and with JSON you have to specify something like id : "sometexthere". I don't know what some of the names are because they're hashed, so I'm not sure how to accomplish this correctly.
lmoelleb 20-May-19 14:00pm    
I added an example regex - the included link to regex101.com will take you to a site where it is already filled in allowing you to play around with it easily.
elfenliedtopfan5 20-May-19 18:23pm    
Umm, that's great. The only issue I have is that it gives me loads of errors, I think due to the formatting of the regex, like a \ ? error.
lmoelleb 20-May-19 23:58pm    
I click the link to regex101 and see no errors. Where do you see errors, and what are the errors? Please never report something like "loads of errors"; always include at least one precise example of what the error text is and where you see it. The regex contains " and \, which must be escaped when you copy it as a string into your C# source - is that the problem? I can't say for sure without precise information.
elfenliedtopfan5 21-May-19 10:43am    
Sorry for the lack of info, it was really early in the morning. So basically yes, when I copied it across to C# as a pattern I got errors on \s and ?, but I assume that's because it's quoted, and I think \ means something in C# as well.

But with a hell of a lot of editing I managed to come up with this:

@"""width""\s*:\s*(?'width'\d+).+?""height""\s*:\s*(?'height'\d+).+?""url""\s*:\s*\""(?'url'.+?)"""

But the regex still just skips the found match part:
https://i.imgur.com/SDJsYff.gifv

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


