|
Hi Michael,
Michael Sydney Balloni wrote: was hoping to get some direction based on my question as to how this tool might best be used
I think that without more of an overview of the project, and summary details of its architecture, APIs, work-flow, features, limitations, etc., you may not get responses that would be useful.
IMHO, JSchell's reaction, posted here, probably reflects the naturally skeptical attitude people with significant DB experience will take to such innovation.
People like me, without pro-level DB experience, may compare what you are doing to Mehdi Gholam's RaptorDB here on CP: [^] a major project developed over years, with extensive performance testing/timing.
If your project has a limited range of functionality ... and I understand what that is ... and performance in its range of use-cases is faster, more efficient, easier to use ... that's fine with me!
«The mind is not a vessel to be filled but a fire to be kindled» Plutarch
|
|
|
|
|
|
Michael Sydney Balloni wrote: general premise
"A simple, easy to use, highly productive,"
My feedback is that I doubt that statement.
Mainstream databases have massive amounts of resources available to them. Productivity with persisted data is often limited by the complexity of the requirements and how to fit that complexity into the generalities of the common persisted-data solutions. However, that complexity is what allows one to craft solutions in the first place without doing a deep dive into how persisted data might be used to solve the problem.
Every 'new' persisted data store solution that I have seen introduced in the past 10 years or so was created to support a very specific type of data-driven need, and none of them does anything that could not at least be implemented with traditional mainstream persisted solutions (naturally one might suppose there are speed/cost differences).
Nothing in the above says what exact need you think your solution will address. Does it attempt to replace enterprise Oracle installs? Or is it just another way to manage configuration information for a single app? Maybe it is supposed to compete with Elasticsearch because it is faster? Or is it something completely new, addressing a need only recently identified by changing user-driven demands?
|
|
|
|
|
Thanks for having a look, I really appreciate it.
Taking apart the premise...
- File-based is simpler than server-based.
- Four operations are easier to use than dozens.
- And something simpler and easier to use is more productive.
4db.net builds on SQLite and provides the database functionality needed for basic applications. It's not fast, and it's not sophisticated or robust, but if you only need those four statements, 4db.net is the way to go. To see what I'm talking about, check out this small cars-based sample:
4db.net/Program.cs at main · michaelsballoni/4db.net · GitHub
Imagine how much SQLite code that would take, let alone the server setup for MySQL or some other RDBMS.
Sometimes you need a yacht. But sometimes you just need a canoe. What do you think?
|
|
|
|
|
Michael Sydney Balloni wrote: Sometimes you need a yacht. But sometimes you just need a canoe.
And some people may need to understand what a paddle is, and why a canoe is a good design for a boat.
«The mind is not a vessel to be filled but a fire to be kindled» Plutarch
|
|
|
|
|
Michael Sydney Balloni wrote: Imagine how much SQLite code that would take, let alone the server setup for MySQL or some other RDBMS.
Not sure you really understood my point, though.
I work on large applications, so the amount of code, and even the management of the persisted-data systems, is not significant, because the applications are going to be large regardless.
And even midsize solutions usually have aspirations to be bigger and will likely need more functionality than what is provided here.
So based on the requirements presented here, this only works for very small systems. And for those systems any solution is probably going to be adequate.
So, as I suggested before, without a specifically identified niche I don't see the need for what you are suggesting.
|
|
|
|
|
Point taken. In .NET, coding directly against SQLite is easy and you get great functionality. No brainer there.
So I ported the .NET metastrings / 4db stuff to C++, where coding against the C SQLite library is a pain.
I wrote this:
4db: A Dynamic File-based NoSQL Database for C++
For a C++ programmer wanting basic record persistence and not much else, this is a good alternative to fprintf or to hacking SQLite by hand. In that article, in its code, there are a couple of wrapper classes for SQLite. Those might be more valuable than 4db itself. It's been a wild ride...
|
|
|
|
|
I have some folders containing files; the counts are like 40 Lac, 70 Lac, 1 Cr, 1.5 Cr, and the file types are .png, .xlsx, .txt, .msg, .ico, .jpg, .bmp, etc.
Now I want to insert the file names into my database table along with their sizes, but my implementation takes too much time scanning the folders, and then it throws an out-of-memory exception.
Can anyone please help me out here: how can I implement this scenario in a better manner?
How can I insert into the table faster?
I am using C#.NET with a PostgreSQL database.
Ankur B. Patel
|
|
|
|
|
First of all, what is a lac and cr?
This is never going to be "fast"; more important is that you do it correctly. I assume you're putting all your sh*t in a blob. Don't; txt should be archived as memo, so you can later search it from the DB.
If you just want to archive names and their sizes, read the entire folder's contents and spawn some threads to save chunks of that.
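A minimal sketch of that "save chunks on worker threads" idea, with the save step as a stand-in (in the real code each chunk would be one batched INSERT against PostgreSQL; the names here are made up for the illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Demo: "save" ten file names in chunks of three.
var names = Enumerable.Range(0, 10).Select(i => $"file{i}.txt").ToList();
Console.WriteLine(ChunkSaver.SaveInChunks(names, 3)); // 10

static class ChunkSaver
{
    // Splits the name list into chunks and hands each chunk to a worker task.
    // Returns the total number of rows "saved".
    public static int SaveInChunks(IReadOnlyList<string> names, int chunkSize)
    {
        var tasks = new List<Task<int>>();
        for (int i = 0; i < names.Count; i += chunkSize)
        {
            var chunk = names.Skip(i).Take(chunkSize).ToList(); // capture per task
            tasks.Add(Task.Run(() =>
            {
                // Stand-in for one batched INSERT of this chunk.
                return chunk.Count;
            }));
        }
        Task.WhenAll(tasks).GetAwaiter().GetResult();
        return tasks.Sum(t => t.Result);
    }
}
```

Only the chunk currently being saved is held per worker, which keeps memory bounded regardless of how many files there are in total.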
Bastard Programmer from Hell
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
|
Yah. Sort of like Americans with their pounds.
In international communication we use the SI system. If you can't, better learn.
Bastard Programmer from Hell
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
Thanks, but just to clear things up: forget about Lac and Cr (it's Lakh and Crore); keep in mind that you have millions of files in a folder, and I just want to insert the file names and their sizes in bytes into the database.
How do I bulk insert? And also keep in mind that system memory shouldn't go high.
Please suggest how I can do it quickly and securely.
|
|
|
|
|
Pipe a DIR listing to a text file. That'll give you the list you need, which you can then "substring" and load into a table of file names, etc.
Displays a list of files and subdirectories in a directory.
DIR [drive:][path][filename] [/A[[:]attributes]] [/B] [/C] [/D] [/L] [/N]
[/O[[:]sortorder]] [/P] [/Q] [/R] [/S] [/T[[:]timefield]] [/W] [/X] [/4]
[drive:][path][filename]
Specifies drive, directory, and/or files to list.
/A Displays files with specified attributes.
attributes D Directories R Read-only files
H Hidden files A Files ready for archiving
S System files I Not content indexed files
L Reparse Points O Offline files
- Prefix meaning not
/B Uses bare format (no heading information or summary).
/C Display the thousand separator in file sizes. This is the
default. Use /-C to disable display of separator.
/D Same as wide but files are list sorted by column.
/L Uses lowercase.
/N New long list format where filenames are on the far right.
/O List by files in sorted order.
sortorder N By name (alphabetic) S By size (smallest first)
E By extension (alphabetic) D By date/time (oldest first)
G Group directories first - Prefix to reverse order
/P Pauses after each screenful of information.
/Q Display the owner of the file.
/R Display alternate data streams of the file.
/S Displays files in specified directory and all subdirectories.
It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it.
― Confucian Analects: Rules of Confucius about his food
|
|
|
|
|
If this is your idea of a backup mechanism, don't bother: it'll never be quick, it'll never be efficient, and it will always risk running out of memory.
Instead, think about using a "proper" backup system which archives the disk as sectors instead of files - it's a lot quicker, a lot more efficient, and much, much safer.
Remember, backups should be air-gapped for safety: DBs and suchlike are just files, and as such are just as much at risk from ransomware as any other data (more so in some cases, as they are a prime target for ransomware to exploit).
I use AOMEI Backupper - it has a free Standard version and it allows you to mount the backup images as a disk and retrieve individual files if necessary.
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
"Common sense is so rare these days, it should be classified as a super power" - Random T-shirt
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
Hi Ankur,
I'm going to assume your files are in a bunch of directories, not all in one or mostly in a few, but leafy. I'd use the System.IO classes Directory, Path, and FileInfo. You can use Directory.GetDirectories to get the directories, then loop over those directory paths calling Directory.GetFiles to get the paths to the files; then, for each file path, call Path.GetFileName to get the filename and (new FileInfo(filePath)).Length to get the file length. I'd use a DB transaction per directory: create an INSERT statement per file, execute it inside the transaction, and commit when you're done with the files in that directory.
Things not to do:
Don't get all files in all directories.
Don't build one SQL statement for all the files.
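Here's a sketch of that walk, using lazy enumeration so memory stays bounded by the largest single directory. The INSERT/transaction itself is left as a comment, since that part depends on your PostgreSQL driver:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Demo: print the batch size for each directory under the current one.
foreach (var batch in FileScanner.ScanByDirectory(Directory.GetCurrentDirectory()))
    Console.WriteLine($"{batch.Count} files");

static class FileScanner
{
    // Yields one (name, size) batch per directory. The caller wraps each
    // batch in a transaction: BEGIN; one INSERT per row (or a batched
    // INSERT); COMMIT. EnumerateFiles/EnumerateDirectories are lazy, so
    // a full listing of millions of files is never held in memory.
    public static IEnumerable<List<(string Name, long Size)>> ScanByDirectory(string root)
    {
        var pending = new Stack<string>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            foreach (string sub in Directory.EnumerateDirectories(dir))
                pending.Push(sub);

            var batch = new List<(string Name, long Size)>();
            foreach (string path in Directory.EnumerateFiles(dir))
                batch.Add((Path.GetFileName(path), new FileInfo(path).Length)); // size in bytes
            if (batch.Count > 0)
                yield return batch;
        }
    }
}
```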
Hope this helps, -Michael
|
|
|
|
|
Ankur B. Patel wrote: can anyone please help me out here, how can i implement this scenario in a better manner.
The easiest and best answer: do not do that in the first place.
File systems have existed for a long time, and they exist solely to manage files. What you have are files.
If you want to reference the files in the database then do just that: keep the files on the file system and store a reference (absolute or relative) to the location of each file.
You can keep metadata on the file in the database, such as name, size, type, etc.
You might also want to ensure uniqueness. You do that by finding out what makes one file always different from another (usually some combination of name and location) and ensuring that is maintained.
Finally, of course, consider how these files will be used. For example, if you expect a png to show up in a web page that gets 10,000 unique visits a day, you certainly do not want to be pulling it out of a database.
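A sketch of that layout: the bytes stay on disk, one metadata row per file goes in the database, and the full path acts as the uniqueness key. The dictionary here stands in for a table with a UNIQUE constraint on the path column; the type and member names are made up for the illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Demo: registering the same file twice only succeeds once.
string demo = Path.GetTempFileName();
var catalog = new FileCatalog();
Console.WriteLine(catalog.TryRegister(demo)); // True
Console.WriteLine(catalog.TryRegister(demo)); // False
File.Delete(demo);

// One metadata row per file; the file contents never enter the database.
record FileMeta(string Path, string Name, long SizeBytes, string Extension);

class FileCatalog
{
    // Keyed by full path, playing the role of a UNIQUE constraint.
    private readonly Dictionary<string, FileMeta> _byPath =
        new(StringComparer.OrdinalIgnoreCase);

    public bool TryRegister(string fullPath)
    {
        var info = new FileInfo(fullPath);
        var meta = new FileMeta(info.FullName, info.Name, info.Length, info.Extension);
        return _byPath.TryAdd(meta.Path, meta); // false = already catalogued
    }

    public int Count => _byPath.Count;
}
```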
|
|
|
|
|
I have this Linq-To-Sql query. I'm now going to introduce conditions that match the parameters:
public List<VendorsByProjectAndJobReportEntity> GetVendorsByProjectAndJobReportData(
    int appCompanyId,
    List<int> projectIds,
    List<int> jobIds,
    List<PurchaseOrderType> purchaseOrderTypes)
{
    using (var dc = GetDataContext())
    {
        var results = (from j in dc.Jobs
                       join ac in dc.AppCompanies on j.AppCompanyId equals ac.Id
                       join p in dc.Projects on j.ProjectId equals p.Id
                       join poh in dc.PurchaseOrderHeaders on j.Id equals poh.JobId
                       join cl in dc.CompanyLocations on poh.VendorId equals cl.Id
                       where ac.Id == appCompanyId &&
                             !j.DeletedDT.HasValue &&
                             !p.DeletedDT.HasValue &&
                             (poh.POStatus != (int)PurchaseOrderPOState.Cancelled &&
                              poh.POStatus != (int)PurchaseOrderPOState.Draft)
                       select new VendorsByProjectAndJobReportEntity
                       {
                           ProjectId = p.Id,
                           ProjectInfo = $"{p.ProjectNumber.ToString()} - {p.ProjectName}",
                           JobId = j.Id,
                           JobInfo = $"{j.JobNumber.ToString()} - {j.Phase}",
                           VendorId = cl.Id,
                           VendorName = cl.Location,
                           PurchaseOrderType = (PurchaseOrderType)poh.PurchaseOrderType,
                           PurchaseOrderTotal = poh.Total,
                           ReportTitle = $"{ac.CompanyName} Vendors by Project and Job"
                       }).ToList();

        return results.OrderBy(x => x.ProjectInfo)
                      .ThenBy(x => x.JobInfo)
                      .ThenBy(x => x.PurchaseOrderTypeDesc)
                      .ThenBy(x => x.VendorName)
                      .ToList();
    }
}
I want to replace the Where clause with a PredicateBuilder, so I start out like this:
var predicate = PredicateBuilder.New<VendorsByProjectAndJobReportEntity>();
predicate = predicate.And(x => !x.DeletedDT.HasValue);
First, I don't want either Project or Job records if they're deleted. Which record is 'x' referring to in the PredicateBuilder?
The only way I can see to make this work is to add both a ProjectDeletedByDT and a JobDeletedByDT to the entity and assign both of those dates to it.
Anyone have a better approach?
If it's not broken, fix it until it is.
Everything makes sense in someone's mind.
Ya can't fix stupid.
|
|
|
|
|
Kevin Marois wrote: Anyone have a better approach?
Same advice as in SQL classes:
Don't bloody try to do it all in one single query.
Make it multiple simple queries that are easy to debug.
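As an illustration of the advice above, here is the same shape broken into steps over plain in-memory lists. The types and names are made up for the sketch; with Linq-to-Sql each step would be its own small, debuggable query:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Demo data: one live project, one deleted project.
var projects = new List<Project> { new(1, "P1", null), new(2, "P2", DateTime.UtcNow) };
var jobs = new List<Job> { new(1, 1, "Phase A", null), new(2, 2, "Phase B", null) };
foreach (var row in ReportSteps.Build(projects, jobs))
    Console.WriteLine($"{row.Project} / {row.Phase}"); // P1 / Phase A

record Project(int Id, string Name, DateTime? DeletedDT);
record Job(int Id, int ProjectId, string Phase, DateTime? DeletedDT);

static class ReportSteps
{
    public static List<(string Project, string Phase)> Build(
        IEnumerable<Project> allProjects, IEnumerable<Job> allJobs)
    {
        // Step 1: live projects only. Easy to inspect on its own.
        var liveProjects = allProjects.Where(p => !p.DeletedDT.HasValue).ToList();

        // Step 2: live jobs only.
        var liveJobs = allJobs.Where(j => !j.DeletedDT.HasValue).ToList();

        // Step 3: a simple join over the two already-filtered sets,
        // so there is no ambiguity about which record a predicate applies to.
        return (from j in liveJobs
                join p in liveProjects on j.ProjectId equals p.Id
                select (p.Name, j.Phase)).ToList();
    }
}
```

Each intermediate list answers one question, which also sidesteps the "which record is 'x'?" problem: each predicate is applied to exactly one entity type.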
Bastard Programmer from Hell
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
I'm attempting to get data for a dialog that has 2 comboboxes. I want to keep the UI from hanging, so I'm trying this:
WaitIndicatorVisibility = Visibility.Visible;
List<ProjectHeaderEntity> projects = null;
List<NavigationGroupEntity> jobs = null;
await Task.Run(() => { projects = AppCore.BizObject.GetProjectHeaders(AppCore.AppCompany.Id).ToList(); });
await Task.Run(() => { jobs = AppCore.BizObject.GetJobListHeaders(AppCore.AppCompany.Id).ToList(); });
var dialogVm = new VendorsByProjectAndJobReportViewModel(projects, jobs);
DialogResultEx result = DialogService.ShowDialog(dialogVm, typeof(MainWindowView));
This works fine. Both lists get populated and the dialog works fine.
But if I attempt to use WhenAll to improve performance...
WaitIndicatorVisibility = Visibility.Visible;
List<ProjectHeaderEntity> projects = null;
List<NavigationGroupEntity> jobs = null;
List<Task> tasks = new List<Task>();
var t1 = Task.Run(() => { projects = AppCore.BizObject.GetProjectHeaders(AppCore.AppCompany.Id).ToList(); });
var t2 = Task.Run(() => { jobs = AppCore.BizObject.GetJobListHeaders(AppCore.AppCompany.Id).ToList(); });
tasks.Add(t1);
tasks.Add(t2);
await Task.WhenAll(tasks).ConfigureAwait(false);
var dialogVm = new VendorsByProjectAndJobReportViewModel(projects, jobs);
DialogResultEx result = DialogService.ShowDialog(dialogVm, typeof(MainWindowView));
Both lists get populated, but when I attempt to show the dialog I get the exception "The calling thread must be STA, because many UI components require this."
If I remove the ConfigureAwait then everything works fine.
I've tried looking up ConfigureAwait. Can someone explain this in plain English?
If it's not broken, fix it until it is.
Everything makes sense in someone's mind.
Ya can't fix stupid.
|
|
|
|
|
ConfigureAwait(false) tells the code that you don't want to use the same "execution context" to continue running the method after the task has completed.
In a desktop application, when you start a task from the UI/dispatcher thread, running the method on the same execution context means running it on the UI thread. If you add .ConfigureAwait(false), the rest of the method will not run on the UI thread unless the task completes synchronously.
Since you want to run the rest of the method on the UI thread, you need to drop the .ConfigureAwait(false) from your code.
I'd also be inclined to drop the closures. And you don't really need Task.WhenAll here at all.
Task<List<ProjectHeaderEntity>> projectsTask = Task.Run(() => AppCore.BizObject.GetProjectHeaders(AppCore.AppCompany.Id).ToList());
Task<List<NavigationGroupEntity>> jobsTask = Task.Run(() => AppCore.BizObject.GetJobListHeaders(AppCore.AppCompany.Id).ToList());
List<ProjectHeaderEntity> projects = await projectsTask;
List<NavigationGroupEntity> jobs = await jobsTask;
var dialogVm = new VendorsByProjectAndJobReportViewModel(projects, jobs);
DialogResultEx result = DialogService.ShowDialog(dialogVm, typeof(MainWindowView));
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
Richard Deeming wrote: ConfigureAwait(false) tells the code that you don't want to use the same "execution context" to continue running the method after the task has completed.
OK, I see. I misunderstood what it did.
Richard Deeming wrote: I'd also be inclined to drop the closures.
What's a 'closure'?
Richard Deeming wrote: And you don't really need Task.WhenAll here at all.
I used WhenAll because I thought that by using that then both the tasks would execute simultaneously. I thought that WITHOUT WhenAll, the first would run, then the other one.
If it's not broken, fix it until it is.
Everything makes sense in someone's mind.
Ya can't fix stupid.
|
|
|
|
|
Kevin Marois wrote: What's a 'closure'?
When an anonymous function / lambda method references local variables, those variables are hoisted into a compiler-generated class called a closure.
Given:
public async Task Foo()
{
    List<ProjectHeaderEntity> projects = null;
    await Task.Run(() => { projects = AppCore.BizObject.GetProjectHeaders(AppCore.AppCompany.Id).ToList(); });
    Console.WriteLine(projects.Count);
}
the compiler will generate something closer to:
private sealed class <>_SomeRandomGeneratedName
{
    public List<ProjectHeaderEntity> projects;
    public Action TheAction;

    public void TheActualMethod()
    {
        this.projects = AppCore.BizObject.GetProjectHeaders(AppCore.AppCompany.Id).ToList();
    }
}

public async Task Foo()
{
    var closure = new <>_SomeRandomGeneratedName();
    closure.TheAction = new Action(closure.TheActualMethod);
    await Task.Run(closure.TheAction);
    Console.WriteLine(closure.projects.Count);
}
By changing the code so that it no longer refers to the local variables, you can get rid of the closure class:
public async Task Foo()
{
    List<ProjectHeaderEntity> projects = await Task.Run(() => AppCore.BizObject.GetProjectHeaders(AppCore.AppCompany.Id).ToList());
    Console.WriteLine(projects.Count);
}
should compile to something like:
private static Func<List<ProjectHeaderEntity>> TheCachedDelegate;

private static List<ProjectHeaderEntity> TheActualMethod()
{
    return AppCore.BizObject.GetProjectHeaders(AppCore.AppCompany.Id).ToList();
}

public async Task Foo()
{
    if (TheCachedDelegate == null)
    {
        TheCachedDelegate = new Func<List<ProjectHeaderEntity>>(TheActualMethod);
    }
    List<ProjectHeaderEntity> projects = await Task.Run(TheCachedDelegate);
    Console.WriteLine(projects.Count);
}
which has significantly fewer allocations, particularly when called multiple times.
Kevin Marois wrote: I thought that WITHOUT WhenAll, the first would run, then the other one.
The task starts running as soon as you call Task.Run. Your code waits for the task to complete when you await it. If you separate the two, you can do other things in between the task starting and waiting for the task to complete, including starting other tasks.
// Sequential: the second task doesn't start until the first completes.
await Task.Run(() => ...);
await Task.Run(() => ...);
DoSomething();

// Concurrent: both tasks start immediately; each await just collects a result.
Task a = Task.Run(() => ...);
Task b = Task.Run(() => ...);
await a;
await b;
DoSomething();

// Concurrent with WhenAll: same concurrency, one combined wait.
Task a = Task.Run(() => ...);
Task b = Task.Run(() => ...);
await Task.WhenAll(new[] { a, b });
DoSomething();
You generally need Task.WhenAll when you have an unknown number of tasks to wait for. With two tasks, where you want the return values and they are returning different types, it's easier to just await each in turn. If you used Task.WhenAll, you'd either have to await the tasks again, or use their Result property, to get the return value from each.
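A minimal sketch of that last point: start both typed tasks, then await each in turn. They still run concurrently, and no second await or .Result is needed to collect the values (the workloads are stand-ins for the real backend calls):

```csharp
using System;
using System.Threading.Tasks;

var (count, title) = await TwoTasks.LoadAsync();
Console.WriteLine($"{count} {title}"); // 42 Report

static class TwoTasks
{
    public static async Task<(int Count, string Title)> LoadAsync()
    {
        // Both tasks start running here, concurrently.
        Task<int> countTask = Task.Run(() => 42);          // e.g. load projects
        Task<string> titleTask = Task.Run(() => "Report"); // e.g. load jobs
        // Awaiting in turn just collects results; it doesn't serialize the work.
        return (await countTask, await titleTask);
    }
}
```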
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
Wow, great info. I just learned some new things!
Thanks a bunch
If it's not broken, fix it until it is.
Everything makes sense in someone's mind.
Ya can't fix stupid.
|
|
|
|
|
In your response, in the part about WhenAll, you say they are all running already, then the first waits, then the second waits.
I asked about WhenAll because I watched this video where, at around 22:30, he converts a loop that runs tasks one at a time to creating a List<Task> and using WhenAll to wait for them all to finish. He said that they would all execute at the same time and therefore increase performance a bit. So I figured that since I need BOTH calls to be done before I continue, why wait for one, then the other?
I'm asking because, as I mentioned before, I'm converting a synchronous WPF app to target an API, and in this app there are all kinds of backend calls grouped together, with each one calling the BL => DAL for some piece of data.
For example, the LoadJob method on the Job View makes multiple calls, one after another, to the backend for multiple pieces of data when it's being loaded, and it's getting progressively slower. This view has an upper detail section and subtabs below. Right now ALL of the data for the details AND all the tabs is retrieved using individual backend calls. I'm going to have the detail section load async on opening, and then the subtabs will be lazy loaded. But again, there's a whole bunch of calls, so I thought that if I used WhenAll, I could get parallel loading and reduce the drag on the system.
I'm curious about your thoughts on this.
Thank you.
If it's not broken, fix it until it is.
Everything makes sense in someone's mind.
Ya can't fix stupid.
modified 15-Oct-21 14:43pm.
|
|
|
|
|
If you have a large or unknown number of tasks, or you don't care about the value (if any) returned by the tasks, then Task.WhenAll is usually the simplest option. But it's not required to get multiple tasks running at the same time.
If you're just loading data and setting the properties on your view-models, you generally don't need to be running on the UI thread. If you're updating a collection, you might need to use BindingOperations.EnableCollectionSynchronization[^] to enable updates from a background thread; but most property changes on a view-model will just work from any thread.
But as you discovered, you will need to be running on the UI thread to show another view. Therefore, you probably want to split the job into multiple tasks: a top-level task which kicks off the loading tasks, awaits Task.WhenAll to wait for them to finish (without using ConfigureAwait ), and then displays the dialog; and multiple sub-tasks which load the data and update the view-model, which can use .ConfigureAwait(false) .
Eg:
private async Task LoadProjects(VendorsByProjectAndJobReportViewModel vm)
{
    var projects = await Task.Run(() => AppCore.BizObject.GetProjectHeaders(AppCore.AppCompany.Id).ToList()).ConfigureAwait(false);
    vm.Projects = projects;
}

private async Task LoadJobs(VendorsByProjectAndJobReportViewModel vm)
{
    var jobs = await Task.Run(() => AppCore.BizObject.GetJobListHeaders(AppCore.AppCompany.Id).ToList()).ConfigureAwait(false);
    vm.Jobs = jobs;
}

private async Task ShowVendorsByProjectAndJobReport()
{
    WaitIndicatorVisibility = Visibility.Visible;
    var dialogVm = new VendorsByProjectAndJobReportViewModel();
    var tasks = new List<Task>
    {
        LoadProjects(dialogVm),
        LoadJobs(dialogVm),
    };
    await Task.WhenAll(tasks);
    DialogResultEx result = DialogService.ShowDialog(dialogVm, typeof(MainWindowView));
}
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|