|
Wow, amazing response, thank you for the fast reply.
I'm wondering whether it is overkill now to create a table for the images rather than using the blob, but I like the idea of a table for the images because it feels right in terms of database design.
However, if I have a table for the images (so a List<TemplateImage>), then I will not have a blob of a List<int> to make an MD5 hash of, right? In that case, if I want to check whether a template with the same images exists, I would have to retrieve all the images and check them one by one. If there is a way to MD5 the List<TemplateImage>, that would be best, I think.
I also found "c# - Create Hash Value on a List?" on Stack Overflow; maybe I can use that to hash a List of objects?
|
|
|
|
|
Exoskeletor wrote: I would have to retrieve all the images and check them one by one
You would retrieve all the hashes, not the images, and compare the hashes, not the images. You would only get the images if you need to display them; everything else would use the hashes.
You could also let the database compare the hash by using a WHERE clause in your SQL query.
Never underestimate the power of human stupidity -
RAH
I'm old. I know stuff - JSOP
|
|
|
|
|
Isn't it faster to hash the whole List<TemplateImage> instead of comparing 5 hashes with a set of 5 other hashes?
|
|
|
|
|
Yes, but the business case would need to state that all 5 images must be identical.
|
|
|
|
|
You mean that, in terms of programming design, it is better to check each image manually rather than the list of images?
The images are stored by their resource id, which is unique. I could use this as a hash, right?
|
|
|
|
|
I would need 5 queries to check whether the 5 images exist in the DB. Isn't that expensive?
|
|
|
|
|
This decision cannot be driven by programming requirements; it is a use case/business case requirement. Do you need to know whether all 5 images are identical between records, or do you need to know whether one of the 5 images is different? Does the sequence of the images affect whether two groups count as different?
|
|
|
|
|
I would prefer to know whether all of them are identical. Sequence is important when the app is running; I'm getting them in the sequence I want with this code:
public static int GetSequenceHashCode<T>(this IList<T> sequence)
{
    const int seed = 487;
    const int modifier = 31;
    unchecked
    {
        return sequence.Aggregate(seed, (current, item) =>
            (current * modifier) + item.GetHashCode());
    }
}
public static void AddTemplate(int category, List<int> images)
{
    var tmpl = new Template()
    {
        Category = category,
    };
    var img1 = new TemplateImage()
    {
        Category = category,
        Image = images[0],
    };
    var img2 = new TemplateImage()
    {
        Category = category,
        Image = images[1],
    };
    var img3 = new TemplateImage()
    {
        Category = category,
        Image = images[2],
    };
    var img4 = new TemplateImage()
    {
        Category = category,
        Image = images[3],
    };
    var img5 = new TemplateImage()
    {
        Category = category,
        Image = images[4],
    };
    tmpl.TemplateImages = new List<TemplateImage>() { img1, img2, img3, img4, img5 };
    tmpl.ImagesHash = tmpl.TemplateImages.GetSequenceHashCode();
    var result = DatabaseHelper.db().Query<Template>("Select * From Templates where ImagesHash=?", tmpl.ImagesHash).ToList();
    if (result.Count == 0)
    {
        DatabaseHelper.db().InsertAll(tmpl.TemplateImages);
        DatabaseHelper.db().Insert(tmpl);
        DatabaseHelper.db().UpdateWithChildren(tmpl);
        var employeeStored = DatabaseHelper.db().GetWithChildren<Template>(tmpl.Id);
    }
}
However, GetSequenceHashCode doesn't work as expected: it gives different results on every run, so I will have to check with other code.
For the current state of the app it is enough if only one image is found, but in the future I might need to know whether all of the images are the same (and all of them exist in one template).
I can get what I want (checking that all the images exist, I mean) with this:
var result = DatabaseHelper.db().Query<TemplateImage>("Select * from TemplateImages where Image=?", images[0]).Count +
    DatabaseHelper.db().Query<TemplateImage>("Select * from TemplateImages where Image=?", images[1]).Count +
    DatabaseHelper.db().Query<TemplateImage>("Select * from TemplateImages where Image=?", images[2]).Count +
    DatabaseHelper.db().Query<TemplateImage>("Select * from TemplateImages where Image=?", images[3]).Count +
    DatabaseHelper.db().Query<TemplateImage>("Select * from TemplateImages where Image=?", images[4]).Count;
if (result == 5)
But this looks very bad to me. Will it have an impact on performance? Is there another way to get the same result?
modified 8-Mar-20 22:23pm.
|
|
|
|
|
Exoskeletor wrote: However, GetSequenceHashCode doesn't work as expected: it gives different results on every run, so I will have to check with other code.
Try the MD5 hash; it will be consistent between runs.
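A note on why the values differ: TemplateImage, as posted in this thread, does not override GetHashCode(), so item.GetHashCode() falls back to the default per-instance value for a reference type, which is not derived from the Image field and is not stable between runs. A minimal sketch of a repeatable variant, assuming it is acceptable to hash the List<int> of image values instead of the TemplateImage objects (StableSequenceHash is a hypothetical name, not something from the thread):

```csharp
using System;
using System.Collections.Generic;

public static class StableSequenceHash
{
    // Combines the integers in order, using the same seed and modifier
    // as the GetSequenceHashCode extension posted above.
    public static int Compute(IEnumerable<int> values)
    {
        const int seed = 487;
        const int modifier = 31;
        unchecked
        {
            int current = seed;
            foreach (int value in values)
            {
                // The value itself is used, so the result depends only on the
                // numbers and their order, not on object identity.
                current = (current * modifier) + value;
            }
            return current;
        }
    }
}
```

This is only as stable as the integers fed into it, and a 32-bit result can collide, so a match is better treated as a candidate to verify than as proof of equality.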
Exoskeletor wrote: But this looks very bad to me. Will it have an impact on performance? Is there another way to get the same result?
You could combine those five queries into one. Something like "SELECT Id FROM TemplateImages WHERE ImageHash IN (@value1, @value2, @value3, @value4, @value5)". If the values already exist in the table, such a query would give you their Ids.
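As a rough sketch of that single query, the placeholder list can be generated for however many values there are. This assumes the "?" placeholder style already used in the queries in this thread and the ImageHash column from the example above; InQueryBuilder is a hypothetical helper name.

```csharp
using System;
using System.Linq;

public static class InQueryBuilder
{
    // Builds "SELECT Id FROM TemplateImages WHERE ImageHash IN (?, ?, ...)"
    // with one "?" placeholder per value that will be bound.
    public static string Build(int valueCount)
    {
        if (valueCount < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(valueCount));
        }
        string placeholders = string.Join(", ", Enumerable.Repeat("?", valueCount));
        return "SELECT Id FROM TemplateImages WHERE ImageHash IN (" + placeholders + ")";
    }
}
```

The resulting string would then be passed to the same Query call used elsewhere in the thread, with the five hashes as arguments. Note that the query only returns the rows that were found, so the number of distinct matches still has to be compared with the number of values if all of them must exist.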
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
|
The blob (the image). There's more than one way to create a hash, and different algorithms generate different hashes. MS recommends using SHA256, but the idea is the same. See "MD5 Class (System.Security.Cryptography)" on Microsoft Docs.
Once you can do that, you can save the hash with the image. If you want to check whether the image is already in the database, you hash your image and query the database to see if that hash is already there.
|
|
|
|
|
I'm not using a blob any more; I followed your suggestion and now I use this:
[Table("Templates")]
public class Template
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }
    public int Category { get; set; }
    [OneToMany]
    public List<TemplateImage> TemplateImages { get; set; }
    public int ImagesHash { get; set; }
}
[Table("TemplateImages")]
public class TemplateImage
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }
    public int Category { get; set; }
    public int Image { get; set; }
    [ForeignKey(typeof(Template))]
    public int TemplateId { get; set; }
}
So the question is, how can I MD5 a List<TemplateImage> (or use another consistent hash)?
Quote: You could combine those five queries into one. Something like "SELECT Id FROM TemplateImages WHERE ImageHash IN (@value1, @value2, @value3, @value4, @value5)". If the values already exist in the table, such a query would give you their Ids.
Why should every image have a hash? What hash would they have? I don't understand.
This is how I can check whether one image of the new template I'm trying to add exists in a template in the DB:
var result = DatabaseHelper.db().Query<TemplateImage>("Select * from TemplateImages where Image=?", images[0]);
If I can find a single query that does that for all the images, I think I'm good.
modified 9-Mar-20 6:03am.
|
|
|
|
|
Exoskeletor wrote: So the question is, how can I MD5 a List<TemplateImage> (or use another consistent hash)?
Not the list, but the items in it; for each image, you want something that represents it so you can compare the item to other items. If you were to load each image and compare them pixel by pixel, the process would be slow. If you get a fingerprint for each image, you can compare those instead; they're a lot shorter. If each image has a fingerprint stored, then you can check whether the same image is there by checking whether its fingerprint is there.
Exoskeletor wrote: If I can find a single query that does that for all the images, I think I'm good.
I gave an example of that in the previous post. You'd need a query that checks for at least five values, no?
|
|
|
|
|
You gave an example with an ImageHash. What is the ImageHash in your example? How can I get a hash from the List<TemplateImage> or from the images?
|
|
|
|
|
An example is "Calculate MD5 Checksum for a File using C#"; that's assuming your image is a file, and it will return a string with the hashed value. Write a small project that hashes a single image file and shows the result; then you can work from there. The resulting string will change whenever the content of the image changes.
In your project, you'd have to compute the hash when you store the image in the database. Get to know the concept first, before trying to add it to an existing project.
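A minimal sketch of such a small test project, using only the standard System.Security.Cryptography.MD5 class. FileFingerprint and Md5OfFile are hypothetical names, and the lowercase hex formatting is a choice made here rather than anything required.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileFingerprint
{
    // Returns the MD5 hash of the file's bytes as a 32-character lowercase hex string.
    public static string Md5OfFile(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] digest = md5.ComputeHash(stream);
            // BitConverter yields "5D-41-40-..."; strip the dashes and lowercase it.
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Calling Md5OfFile on the same unchanged file gives the same string on every run, which is the property the sequence hash above was missing. Swapping MD5.Create() for SHA256.Create() would follow the same pattern with a longer result.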
|
|
|
|
|
My images are XML files that exist within the Android app, with simple string code inside, but how can I use that in my case? If I hash every image then I will have 5 hashes. How can I store 5 hashes and check them with one query when I'm inserting a new template?
|
|
|
|
|
I think you are right: I have to MD5 the file, and not the resource id, which is generated when the code is compiled. Thanks for that. I don't know how I can hash all of them and store them together as a single hash, but for now I will only check whether one image already exists.
|
|
|
|
|
Exoskeletor wrote: How can I store 5 hashes
The same way you store the images. Add a string column next to the blob itself. Calculate the hash from the blob, and write them at the same time.
Exoskeletor wrote: and check them with one query when I'm inserting a new template?
You don't check during the insert, but prior to it. You weren't trying to compare images during the insert before; the same applies here. You can check whether any of the five hashes exists in the database with a single query, though. But one problem at a time: it would work with five queries too, and you can rewrite it to use a single query later on.
|
|
|
|
|
Which blob? I don't use blobs any more. Do you mean I should create a blob in order to hash it? Can you give me a table structure, because I'm getting confused? This is my structure now:
[Table("Templates")]
public class Template
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }
    public int Category { get; set; }
    [OneToMany, Indexed(Name = "TemplateImagesUnique", Unique = true)]
    public List<TemplateImage> TemplateImages { get; set; }
    public string ImagesHash { get; set; }
}
[Table("TemplateImages")]
public class TemplateImage
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }
    public int Category { get; set; }
    public int Image { get; set; }
    [ForeignKey(typeof(Template))]
    public int TemplateId { get; set; }
}
|
|
|
|
|
Exoskeletor wrote: Which blob? I don't use blobs any more.
With the blob I meant the image you're storing.
[Table("TemplateImages")]
public class TemplateImage
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }
    public int Category { get; set; }
    public byte[] Image { get; set; }     // the image content itself (the "blob")
    public string ImageHash { get; set; } // hash computed from that content
    [ForeignKey(typeof(Template))]
    public int TemplateId { get; set; }
}
|
|
|
|
|
Oh, I'm storing the images as int because this int represents the ResourceId of the image. I thought that, since the images are stored inside the local app folder, it would be more efficient to store the ResourceId rather than the image itself (which is a string; my images are vector .xml files).
Is this a bad idea?
I'm starting to think that this is a very silly idea, because I guess the ResourceId is regenerated every time I compile a new version of my app, so I'm trying to create a unique hash of something that is not unique.
|
|
|
|
|
What if I do this:
1) Get the XML content of each image as a string,
2) concatenate the strings from all the images into one string with +,
3) MD5 hash that string?
It sounds like overkill though. I think I have to read the images as files, and to do that I have to open a stream, which sounds very heavy to me, because I would also have to move the images to assets; Android can't read images from the drawable folder.
I think I'm coming to a dead end, and the only solution is to delete and recreate the database every time the app is opened, or every time I release a new update of the app. I'm not sure.
modified 9-Mar-20 8:06am.
|
|
|
|
|
Exoskeletor wrote: 2) concatenate the strings from all the images into one string with +,
Not "all" the images; you need a separate fingerprint for each image.
Exoskeletor wrote: I think I have to read the images as files, and to do that I have to open a stream, which sounds very heavy to me
You'd have to read the images anyway to use them; opening them as a stream is not heavy, it is the most common way to handle binary data. Since it is XML data and not binary, you may get away with simply reading that string without needing a stream.
|
|
|
|
|
But if I have a separate hash for every image, where am I going to store it? In the TemplateImage table? One hash for each image?
|
|
|
|
|
Exoskeletor wrote: One hash for each image?
Yes, since each hash represents the content of the image in another form. If you have that for each image, then checking whether you already have an image is as easy as computing the hash of the new image (the one you want to compare to the ones already there) and seeing whether that hash is in there.
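A sketch of how that could look for XML images read as strings: one MD5 per image, in order. The Combine method goes one step further than what was suggested above and is purely an assumption: it hashes the ordered per-image hashes to get a single value per template, for the "all five identical, in the same sequence" case asked about earlier. All names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class TemplateFingerprint
{
    // MD5 of a text (for example the XML content of one vector image), as lowercase hex.
    public static string Md5OfText(string text)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(text));
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }

    // One hash per image, kept in the order the images were given.
    public static List<string> HashEach(IEnumerable<string> imageXmlInOrder)
    {
        return imageXmlInOrder.Select(Md5OfText).ToList();
    }

    // Assumption, not from the thread: a template-level value made by hashing
    // the per-image hashes joined in order, so a different order gives a different value.
    public static string Combine(IEnumerable<string> imageHashesInOrder)
    {
        return Md5OfText(string.Join("|", imageHashesInOrder));
    }
}
```

The per-image hashes would go in the ImageHash column of TemplateImages, and the combined value, if it is wanted at all, in ImagesHash on Templates. Whether the combined value is needed depends on the business case discussed earlier in the thread.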
|
|
|
|
|