|
LOL
well, this function is only one part of my program.
it's mainly for picture management.
one part is a duplicate file finder that first sorts out files that exist only once, then compares the leftover size-identical files by their hashes and displays the results in a datagridview, with the option to delete selected rows.
next part is the problem above: hashing all leftover unique files, comparing them with a hashtable (database), then displaying the results in a datagridview. if I click on a field it will show the image in an imagebox (resized), then I can delete the found files, select the next hashtable database and compare the leftover files without rehashing (I'm clever, LOL)
another part allows me to add new hashes to the databases
another part allows me to compare the databases with themselves or each other, so I can throw out duplicate hashes
(and some other minor parts/functions I need)
and well, I can't complain, it does its job, saves me a lot of time
and with your help it works a little better now
my problem is, I don't learn that well by books, I usually learn by doing.
for example: just recently I thought it might be faster to make a dataview using the hashtable and then search the hash of a new file in it. I found out it's WAY faster than how I did it before.
Before I had 2 nested FOR...NEXT loops, comparing each row from table_a with every row in table_b. the hashtable grows and grows, so doing it that way takes a lot of time. 400,000,000 comparisons took more than 5 minutes, maybe 10.
now it's so fast it takes like 3 seconds or something
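A minimal sketch of why an indexed lookup beats the two nested loops. It assumes the known hashes are held in a HashSet(Of String) as a stand-in for the hashtable database; that container and the names FindKnown, newHashes and knownHashes are illustrative and not taken from the program described above. A sorted DataView gets a comparable speed-up from its index.

```vbnet
Imports System
Imports System.Collections.Generic

Module HashLookupSketch
    ' Returns the entries of newHashes that already exist in knownHashes.
    ' Each Contains call is an average constant-time lookup, so the total
    ' work grows with the number of new files instead of
    ' (new files x database rows) as with two nested loops.
    Function FindKnown(ByVal newHashes As IEnumerable(Of String), ByVal knownHashes As HashSet(Of String)) As List(Of String)
        Dim matches As New List(Of String)
        For Each h As String In newHashes
            If knownHashes.Contains(h) Then matches.Add(h)
        Next
        Return matches
    End Function

    Sub Main()
        ' Case-insensitive so "BB22" and "bb22" count as the same hash.
        Dim known As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)
        known.Add("aa11")
        known.Add("bb22")
        Dim found As List(Of String) = FindKnown(New String() {"BB22", "cc33"}, known)
        Console.WriteLine(String.Join(", ", found.ToArray()))
    End Sub
End Module
```

With 20,000 new files against a 20,000-row database this is 20,000 lookups rather than 400,000,000 comparisons.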
Thank you
|
|
|
|
|
Your hash values not being correct means your code is not OK; you have to fix that first, before you can start working on performance and multi-threading.
BTW: to keep things simple, I would not touch any DataTable/DataSet in the threads, just store the results in an array, or a Dictionary<string filename, int hashValue> ; then after the joins, enumerate and store the results.
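A rough sketch of that approach, under some assumptions: MD5 as the hash, the hash kept as a string rather than an int, and a hypothetical HashWorker class that hashes every n-th file. None of these names come from the original program. Each thread writes only to its own slots of a plain array, and the Dictionary is filled on the calling thread after the joins.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Security.Cryptography
Imports System.Threading

' Hypothetical worker: hashes files Offset, Offset + Stride, Offset + 2 * Stride, ...
Class HashWorker
    Public Files As String()
    Public Hashes As String()
    Public Offset As Integer
    Public Stride As Integer

    Public Sub Run()
        Using hasher As MD5 = MD5.Create()
            For i As Integer = Offset To Files.Length - 1 Step Stride
                Using fs As FileStream = File.OpenRead(Files(i))
                    ' Each index is written by exactly one thread, so no lock is needed.
                    Hashes(i) = BitConverter.ToString(hasher.ComputeHash(fs))
                End Using
            Next
        End Using
    End Sub
End Class

Module ThreadedHashSketch
    Function HashFiles(ByVal files As String(), ByVal threadCount As Integer) As Dictionary(Of String, String)
        Dim hashes(files.Length - 1) As String
        Dim threads As New List(Of Thread)

        For t As Integer = 0 To threadCount - 1
            Dim worker As New HashWorker()
            worker.Files = files
            worker.Hashes = hashes
            worker.Offset = t
            worker.Stride = threadCount
            Dim th As New Thread(AddressOf worker.Run)
            threads.Add(th)
            th.Start()
        Next

        ' Wait for every worker before touching the results.
        For Each th As Thread In threads
            th.Join()
        Next

        ' Only now, on the calling thread, collect the results;
        ' this is also the safe place to fill a DataTable.
        Dim result As New Dictionary(Of String, String)
        For i As Integer = 0 To files.Length - 1
            result(files(i)) = hashes(i)
        Next
        Return result
    End Function
End Module
```

Whether more threads actually help depends on the disk, so threadCount is best treated as something to measure rather than assume.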
|
|
|
|
|
Using multiple threads will be helpful if it's possible for the different threads to simultaneously be performing useful work, generally using different resources. If two or more threads need to read files off the same physical disk, each thread will likely have to wait any time other threads access the disk, so adding additional threads won't help anything.
Note that because the operating system has its own caching and read-ahead logic, it's often difficult to predict how any particular piece of code will perform. I'm sure that in general the system caching boosts performance, but it does make it much harder to optimize code.
|
|
|
|
|
Hello
yes, I noticed that caching thing. if I restart the process it usually hashes a lot faster
but that won't happen if I use new files or different folders, at least it seems so, because it is as slow as on the 1st start
hashing simultaneously was my idea
the files in the folder are pictures, usual folder size is around 200 MB, though it can vary.
I'm not sure but I think it should at least be faster than trying to hash files that are 200 MB each, copying the files doesn't take that much time either.
I know of a (professional) program that can find duplicate images. It allows the user to set how many threads should be used for processing files. It processes files in a list which are added while scanning for these files in folders.
It works really fast, also uses a progress bar and shows various information while working.
And the HDD is really active during this.
It's just hard to find information on the WWW that shows how to do that
Nik
|
|
|
|
|
I'd like to have a utility to find duplicate files and implement a reasonable backup/archiving approach. One thing I would think might be helpful with large files would be to start by producing a catalog of file sizes. If a file size is unique, one needn't hash anything to know that the file isn't going to match any other. Otherwise, for large files, one could compute a 'quick hash' value by hashing a few 64K chunks of data taken from different areas of the file. If two files have identical quick-hashes, they may or may not be identical, but if a file's quick-hash is unique, that's a sure sign that the file itself is unique.
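A minimal sketch of both ideas, assuming MD5 and three chunks taken from the start, middle and end of the file. The chunk positions and the names GroupBySize and QuickHash are illustrative choices, not an established scheme.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Security.Cryptography

Module QuickHashSketch
    Const ChunkSize As Integer = 64 * 1024

    ' Catalog of file sizes: a size that occurs only once cannot have a duplicate.
    Function GroupBySize(ByVal paths As IEnumerable(Of String)) As Dictionary(Of Long, List(Of String))
        Dim groups As New Dictionary(Of Long, List(Of String))
        For Each p As String In paths
            Dim info As New FileInfo(p)
            If Not groups.ContainsKey(info.Length) Then groups(info.Length) = New List(Of String)
            groups(info.Length).Add(p)
        Next
        Return groups
    End Function

    ' Hashes up to three 64K chunks instead of the whole file.
    ' For files smaller than a chunk the regions simply overlap.
    Function QuickHash(ByVal filePath As String) As String
        Using fs As FileStream = File.OpenRead(filePath)
            Using hasher As MD5 = MD5.Create()
                Dim buffer(ChunkSize - 1) As Byte
                Dim offsets As Long() = New Long() { _
                    0L, _
                    Math.Max(0L, fs.Length \ 2 - ChunkSize \ 2), _
                    Math.Max(0L, fs.Length - ChunkSize)}
                For Each offset As Long In offsets
                    fs.Seek(offset, SeekOrigin.Begin)
                    ' Read may return fewer bytes than requested; hash what was read.
                    Dim count As Integer = fs.Read(buffer, 0, ChunkSize)
                    hasher.TransformBlock(buffer, 0, count, Nothing, 0)
                Next
                hasher.TransformFinalBlock(buffer, 0, 0)
                Return BitConverter.ToString(hasher.Hash)
            End Using
        End Using
    End Function
End Module
```

Files that share both size and quick hash are only candidates; a full hash or byte-by-byte comparison is still needed before treating them as duplicates.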
|
|
|
|
|
yes, 2 good ideas IMO.
1st one is how I do it, getting filename and filesize into a datatable, sort it by filesize, compare it with itself to find duplicate rows and import them into a new table.
then process that table, hash all files in it (I hash completely since I'm doing that with pictures which aren't bigger than 2-3 MB) and again compare the datatable, removing unique rows (hash column)
works very well
2nd describes what some duplicate file finders do, hashing the 1st bytes of files with same size, if they are identical it'll hash some more bytes until it finds a difference. at least this is what I found out from my research lol
|
|
|
|
|
I need to make a program that generates 15 random numbers as an array and then lists them in a listbox. I also need to be able to display the maximum and minimum in a label. Here's what I've got so far...
Public Class Form1
    Dim strNumbers(14) As String
    Private Sub btnGenerate_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGenerate.Click
        Dim intRandom As New Random()
        Dim intLoop As Integer
        lstOutcome.Items.Clear()
        For intLoop = 0 To strNumbers.Length - 1
            lstOutcome.Items.Add(intRandom.Next(1, 100))
        Next intLoop
    End Sub
    Private Sub btnMaximum_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnMaximum.Click
        Dim intLoop As Integer
        Dim intMax As Integer
        For intLoop = 0 To strNumbers.Length - 1
            intMax = strNumbers(intLoop)
            If intMax < strNumbers(intLoop) Then
                intMax = strNumbers(intLoop)
            End If
        Next intLoop
        lblMinMax.Text = intMax
    End Sub
End Class
I've got the 15 random numbers in the listbox. I just can't figure out why my maximum code isn't working. Any tips would be awesome right now...
|
|
|
|
|
look again right here:
intMax = strNumbers(intLoop)
If intMax < strNumbers(intLoop) Then
|
|
|
|
|
Put the intMax initialization outside the For loop. BTW, you can use a List instead of an array, since it has a Sort method. Then all you would need is to get the first and last elements.
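A small sketch of the loop with the initialization moved out. It assumes the numbers are held in an Integer array; the names FindMax and FindMin are made up for the example.

```vbnet
Imports System

Module MinMaxSketch
    ' The running maximum is set once, before the loop, and is only
    ' replaced when a larger element is found.
    Function FindMax(ByVal numbers As Integer()) As Integer
        Dim max As Integer = numbers(0)
        For i As Integer = 1 To numbers.Length - 1
            If numbers(i) > max Then max = numbers(i)
        Next
        Return max
    End Function

    ' Same loop with the comparison flipped.
    Function FindMin(ByVal numbers As Integer()) As Integer
        Dim min As Integer = numbers(0)
        For i As Integer = 1 To numbers.Length - 1
            If numbers(i) < min Then min = numbers(i)
        Next
        Return min
    End Function

    Sub Main()
        Dim numbers As Integer() = {42, 7, 99, 15}
        Console.WriteLine("Max: " & FindMax(numbers) & ", Min: " & FindMin(numbers))
    End Sub
End Module
```

Assigning intMax inside the loop, as in the posted code, resets it on every pass, so the comparison right after it can never be true.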
|
|
|
|
|
It worked...to do the minimum I should only have to change the sign I guess. Thanks a lot
|
|
|
|
|
I also figured out that the way I generated the random numbers made the code mess up. I ended up having to use
strNumbers(intLoop) = Int((100 - 1 + 1) * Rnd()) + 1 to get the random numbers.
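For what it's worth, in the code as first posted the generated values were only added to the listbox and never stored in strNumbers, so the array stayed empty; the assignment is probably what fixed it, not Rnd() itself. A sketch that keeps the Random class and still fills the array (the names Generate and rng are illustrative):

```vbnet
Imports System

Module RandomFillSketch
    ' Fills an array with 'count' random integers in the range 1..100.
    ' Random.Next(min, max) excludes max, hence 101.
    Function Generate(ByVal count As Integer, ByVal rng As Random) As Integer()
        Dim numbers(count - 1) As Integer
        For i As Integer = 0 To count - 1
            numbers(i) = rng.Next(1, 101)
        Next
        Return numbers
    End Function

    Sub Main()
        Dim numbers As Integer() = Generate(15, New Random())
        For Each n As Integer In numbers
            Console.WriteLine(n)   ' in the form this would be lstOutcome.Items.Add(n)
        Next
    End Sub
End Module
```

Note that Rnd() returns the same sequence on every run unless Randomize() is called first.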
|
|
|
|
|
I'm trying to develop an addin for Outlook 2003 so that when a new calendar item is added, it fires an event that passes along the new calendar item's information to a web service. I already have the web service working, and I can get the addin to load in outlook. What I can't figure out is how to tie the adding of a new calendar event to a function. Any guidance would be appreciated.
|
|
|
|
|
Hi everybody,
I have a funny problem developing an Excel-AddIn with VB Express 2008, based on sample code. The funny thing is that it used to work (at this stage only installing a button), but after working on other projects and coming back it doesn't show the initial message anymore, i.e. the addin doesn't connect. In the other project I had played around with VB's IMessageFilter - so I'm a bit afraid that I keep blocking any Windows messages without knowing?
Maybe someone can have a quick look at the code or has an idea what else might go wrong now? Here's the code:
Option Explicit On
Imports Microsoft.Office.Core
Imports Extensibility
Public Class Connect
    Implements IDTExtensibility2
    Public ext_cm_Startup As ext_ConnectMode
    Public ext_dm_HostShutdown As ext_DisconnectMode
    Public edlcaption As String = ChrW(&H3B1) & ChrW(&H3A9) & "-ED&L"
    Dim oHostApp As Object
    Dim WithEvents MyButton As CommandBarButton
    Private Sub IDTExtensibility2_OnConnection(ByVal Application As Object, ByVal ConnectMode As ext_ConnectMode, _
    ByVal AddInInst As Object, ByRef custom As System.Array) _
    Implements IDTExtensibility2.OnConnection
        On Error Resume Next
        ' Set a reference to the host application...
        oHostApp = Application
        ' If you aren't in startup, then manually call OnStartupComplete...
        If (ConnectMode <> ext_cm_Startup) Then _
        Call IDTExtensibility2_OnStartupComplete(custom)
    End Sub
    Private Sub IDTExtensibility2_OnStartupComplete(ByRef custom As System.Array) Implements IDTExtensibility2.OnStartupComplete
        Dim oCommandBars As Microsoft.Office.Core.CommandBars
        Dim oStandardBar As Microsoft.Office.Core.CommandBar
        On Error Resume Next
        ' Set up a custom button on the "Standard" commandbar...
        oCommandBars = oHostApp.CommandBars
        If oCommandBars Is Nothing Then
            ' Outlook has the CommandBars collection on the Explorer object
            oCommandBars = oHostApp.ActiveExplorer.CommandBars
        End If
        oStandardBar = oCommandBars.Item("Standard")
        If oStandardBar Is Nothing Then
            ' Access names its main toolbar Database
            oStandardBar = oCommandBars.Item("Database")
        End If
        ' In case the button was not deleted, use the existing one...
        MyButton = oStandardBar.Controls.Item(edlcaption)
        If MyButton Is Nothing Then
            MyButton = oStandardBar.Controls.Add(1)
            With MyButton
                .Caption = edlcaption
                .Style = MsoButtonStyle.msoButtonCaption
                .Tag = "EDL"
                .OnAction = "!<MyCOMAddin.Connect>"
                .Visible = True
            End With
        End If
        ' Display a simple message to know which application you started in...
        MsgBox("Started in " & oHostApp.Name & ".")
        oStandardBar = Nothing
        oCommandBars = Nothing
    End Sub
    Private Sub IDTExtensibility2_OnDisconnection(ByVal RemoveMode As Extensibility.ext_DisconnectMode, ByRef custom As System.Array) _
    Implements IDTExtensibility2.OnDisconnection
        On Error Resume Next
        If RemoveMode <> ext_dm_HostShutdown Then _
        Call IDTExtensibility2_OnBeginShutdown(custom)
        oHostApp = Nothing
    End Sub
    Private Sub IDTExtensibility2_OnBeginShutdown(ByRef custom As System.Array) Implements IDTExtensibility2.OnBeginShutdown
        On Error Resume Next
        ' Notify the user you are shutting down, and delete the button...
        MsgBox(String.Format("Der Button '{0}' wird gelöscht.", MyButton.Caption))
        MyButton.Delete()
        MyButton = Nothing
    End Sub
    Private Sub IDTExtensibility2_OnAddInsUpdate(ByRef custom As System.Array) Implements IDTExtensibility2.OnAddInsUpdate
        'You do nothing if this is called, but you need to
        'add a comment so Visual Basic properly implements the function...
    End Sub
    Private Sub MyButton_Click1(ByVal Ctrl As Microsoft.Office.Core.CommandBarButton, ByRef CancelDefault As Boolean) Handles MyButton.Click
        MsgBox("Hier wird eine Aktion vom Add-In ausgeführt!")
    End Sub
    Private Function listFiles(ByVal dir As String) As List(Of String)
        Return Nothing
    End Function
End Class
After compiling, I can successfully register the dll with regasm
regasm MyCOMAddIn.dll /codebase /tlb=MyCOMAddIn.tlb
Gacutil /if MyCOMAddIn.dll
and into the registry
Windows Registry Editor Version 5.00
[HKEY_CURRENT_USER\Software\Microsoft\Office\Excel\Addins\MyComAddin.Connect]
"LoadBehavior"=dword:00000003
"CommandLineSafe"=dword:00000000
"FriendlyName"="MyComAddin Connect Class"
"Description"="MyComAddin Connect Class"
Thank you for your time!
Mick
|
|
|
|
|
Have you tried a reboot?
Tosch
|
|
|
|
|
Hi tosch, thanks for having a look. Of course I have tried a reboot, and not only once. Do you think the code itself is ok? I have no idea where or what to look for...
|
|
|
|
|
This is just a wild guess, but as you said that previously this code worked, and then all of a sudden not anymore, have you checked addin and/or security and/or macro settings in Excel?
Depending on your Office version (I think) addins can be disabled, or blocked.
My advice is free, and you may get what you paid for.
|
|
|
|
|
Thank you Johan for your response. No guess can be wild enough, I know. But: I have checked the security options in Excel. And, according to my limited knowledge, this should be obsolete anyway since the AddIn has been strong-named and signed.
I am more or less afraid that with my above-mentioned IMessageFilter experiments (this project had started a new instance of Excel) I might have changed some (registry?) settings or anything like that, and now prevent Excel from receiving messages. It would therefore help me a lot if you, or any other reader, could tell me that
a) there's no such setting and it makes sense to recheck the project more detailed
or b) there's an error in my code (would I have deleted a line by accident?)
modified on Thursday, May 6, 2010 3:43 PM
|
|
|
|
|
I used Excel's Workbook_Open event to create an instance of the MyCOMAddIn.Connect class. This seems to work, i.e. Excel doesn't complain! Also, a function that I call from VBA gives back proper results.
So the "only" thing that doesn't seem to work is the creation of the button (see 'On_StartupComplete' in the code). So I just didn't realize that it's connecting. Still, the On_StartupComplete event gives me trouble...
One more professional look at the code, please
|
|
|
|
|
I have never developed any addins for Office myself, so forgive me if this is a dumb question: The code looks like it is written in vb.net, so why do you use On Error Resume Next instead of the good old Try Catch block? Or if I misunderstood, and this is VBA, why not use On Error GoTo ErrorCatcher and first find out if your code isn't just throwing an error.
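A tiny sketch of the difference, with a hypothetical CreateButton routine standing in for the button-creation step and made to fail on purpose. Under On Error Resume Next the failure would be skipped silently; with Try/Catch it becomes visible.

```vbnet
Imports System

Module TryCatchSketch
    ' Hypothetical stand-in for the button-creation step; it always fails.
    Sub CreateButton()
        Throw New InvalidOperationException("CommandBar 'Standard' not found")
    End Sub

    ' Returns a description of what went wrong, or "OK" if nothing failed.
    Function TryCreateButton() As String
        Try
            CreateButton()
            Return "OK"
        Catch ex As Exception
            Return ex.GetType().Name & ": " & ex.Message
        End Try
    End Function

    Sub Main()
        Console.WriteLine(TryCreateButton())
    End Sub
End Module
```

In the add-in itself the Catch block could show ex.ToString() in a MsgBox or write it to a log file, since a COM add-in has no console.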
I mean, if it does, you'll never know with this code.
|
|
|
|
|
Hi again Johan,
first things first: You are right that it's VB (.NET Express 2008). And actually that's what I did last night: change to Try... Catch and hope for an error message, following the same thought you had. Unfortunately the message that should report any exception doesn't appear either. The whole event just acts as if it weren't fired at all.
|
|
|
|
|
Hi all,
I'm finding it hard to work with file size conversion in VB.Net. I already have the file size in bytes, but I need to display it as KB in a repeater.
The examples on google are too complex for what I need. Any help is appreciated.
Thanks.
|
|
|
|
|
KB = Bytes/1024. Um, what else do you want to know? What exactly are you stuck on?
|
|
|
|
|
that is a dangerous idea; all files with sizes less than 1024B will show as 0KB, giving you the impression they are empty. Much better is rounding up like this:
sizeInKiloBytes=(sizeInBytes+1023)/1024;
which only shows zero when it really is zero.
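In VB.NET the same rounding-up idea could look like the sketch below. The names ToKilobytes and FormatSize are made up for the example. Note that VB needs the integer division operator "\" here, because "/" would produce a fractional result.

```vbnet
Imports System

Module FileSizeSketch
    ' Converts a byte count to whole kilobytes, rounding up, so that
    ' only an empty file reports 0 KB. "\" is VB's integer division.
    Function ToKilobytes(ByVal sizeInBytes As Long) As Long
        Return (sizeInBytes + 1023) \ 1024
    End Function

    ' Text form for display, e.g. in a repeater item.
    Function FormatSize(ByVal sizeInBytes As Long) As String
        Return ToKilobytes(sizeInBytes).ToString() & " KB"
    End Function

    Sub Main()
        For Each size As Long In New Long() {0, 1, 1024, 1025}
            Console.WriteLine(size & " bytes = " & FormatSize(size))
        Next
    End Sub
End Module
```

So 0 bytes gives 0 KB, 1 and 1024 bytes both give 1 KB, and 1025 bytes gives 2 KB.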
|
|
|
|
|
I found out how to format it, but I'm still getting errors. I will post an update when I work it out.
|
|
|
|
|
I was being overly simplistic to try and find out what exactly the problem is. I would hope that anybody who is working in IT knows the relationships between bits, bytes, K, M, G, T and so on.
But yes, I agree with your point, i.e. never show zero unless it is actually zero.
|
|
|
|
|