We have very large source inputs, spanning a large portion of time, that we plan to load into TimeDoctor. A large source input is 10-20 MB, and a large time range would be something like 24 hours. We have yet to test this, so I have no concrete data, but I can gather some if anyone is interested in hearing more. What I'm concerned with is the memory usage of TimeDoctor. If I were to load a file like the one mentioned above, does TimeDoctor keep all of it in memory? A friend mentioned a database, but from looking at the sources I don't see how that could be true. Can someone please shed a little light on this for me? Thanks a bunch.
If TimeDoctor doesn't use a database, we are thinking of modifying it to rely on the InputSource extensions any time it needs to refresh data in its model. This would probably require some work to decouple the data-fetching code from the model and the view. We think a database would provide the best possible speed when loading a range of events and/or event types. We also believe a database would provide the smallest on-disk footprint, as DBs usually have smart ways of storing their data. I'm envisioning an engineer adding a 10-20 MB file on a daily basis, so the dataset has the potential to become enormous, as does the time to search when filtering data across months.
If this would be helpful to TimeDoctor as an available plugin, let me know; I'd prefer to work with the developers. We would design the database support as a plugin that operates through the extension mechanism, which would give us a decoupled system.
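To make the decoupling idea concrete, here is a minimal sketch. All names here (TraceDataSource, fetch_range, the event tuples) are hypothetical and illustrative; they are not part of the real TimeDoctor code. The point is only that the model would hold a data-source object and ask it for the visible time window, instead of keeping the whole trace in memory.

```python
# Hypothetical sketch of decoupling data fetching from the model.
# TraceDataSource and fetch_range are made-up names for illustration.

class TraceDataSource:
    """The model would hold one of these instead of the full event list;
    implementations could read from a file or a database on demand."""

    def fetch_range(self, start, end):
        raise NotImplementedError


class InMemorySource(TraceDataSource):
    """Stand-in for what would really be a file- or DB-backed source."""

    def __init__(self, events):
        self.events = events  # list of (timestamp, label) pairs

    def fetch_range(self, start, end):
        # Only the requested window is handed back to the view.
        return [e for e in self.events if start <= e[0] < end]


source = InMemorySource([(10, "task_switch"),
                         (25, "semaphore_take"),
                         (40, "task_switch")])
# The view asks only for the visible window, not the whole trace.
print(len(source.fetch_range(0, 30)))  # prints 2
```

A file- or database-backed implementation of fetch_range would let the on-disk dataset grow without the in-memory model growing with it.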
We did use large input files (10 MB) without a problem. There was one PR raised on a 20 MB file; see bug #1646857.
Raising the VM heap size (TimeDoctor.exe -vmargs -Xmx256m) fixed it, and performance remains acceptable.
There's no generic database component in use; TD has a so-called 'model' that implements the 'database'. It has been carefully designed for optimal performance while keeping things simple. I have no experience with generic database components, so I'm not sure what the performance impact would be.
> of storing their data. I'm envisioning an engineer adding a 10-20 MB file on
> a daily basis, so the dataset has the potential to become enormous, as does
> the time to search when filtering data across months.
TimeDoctor has never been used for this kind of long-term logging. Handling one or more 10-20 MB files should not be a problem, but combining them all into a single file would be.
Personally, I think there's little use in visualizing such combined, very large traces. Maybe a separate database for managing sets of traces, with support for browsing, filtering, etc., could be useful.
>> Raising the VM heap memory ....
I'm aware of this kind of fix, but it isn't feasible once the memory on the system runs out. I don't think asking our customers to bump up the memory every few days or weeks as the log files grow is an acceptable model. Database performance, like most things, depends on how the database is used, but we have used simple DBs to get remarkable performance increases over our past tools. This is mainly due to how we create the database and when we make our queries.
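As an illustration of what "how we create the database and when we make our queries" means in practice, here is a small sqlite3 sketch. The schema and column names are invented for this example and have nothing to do with TimeDoctor's actual trace format; the point is that an index on the timestamp column makes a time-window query touch only the rows in that window, regardless of how large the full dataset grows.

```python
import sqlite3

# Illustrative schema only; a real layout would depend on the trace format.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (ts INTEGER, kind TEXT, label TEXT)")
# The index on the timestamp column is what keeps range queries cheap.
con.execute("CREATE INDEX idx_events_ts ON events (ts)")

# Simulate a day's worth of logged events.
con.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(i, "task_switch", "task%d" % (i % 3)) for i in range(1000)])

# Fetch one window, e.g. the currently visible part of the timeline.
rows = con.execute(
    "SELECT ts, label FROM events WHERE ts >= ? AND ts < ?",
    (100, 200)).fetchall()
print(len(rows))  # prints 100
```

With a disk-backed database file instead of ":memory:", the same query pattern would let months of accumulated traces sit on disk while only the filtered window is loaded.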
>> TimeDoctor has never been used for this kind of long-term logging.
I'm mainly trying to illustrate the possibilities, but it's still possible we'll see 100 MB logs even over much shorter periods of time. Mainly, I'm optimizing ahead of a looming issue that we will likely have to solve in the near future. At that point I'll have a lot more detail and specifics about the whys and whats. Thanks for your assistance.
I regularly set the binary capture buffer for TimeDoctor to 12 MB. My largest time bottleneck is downloading this from the target over JTAG. This binary file regularly expands to 50 MB or more. Here we are looking at a couple of minutes of execution data from the media processor. While I logged a PR on the open-source version of TimeDoctor to support these files, increasing the VM size seems to solve the problem.
What I see in your thread is a wish or intent to do something very different with TimeDoctor. Maybe you could explain more? I am capturing and analyzing the behavior of a media processor with 12-100 tasks and a comparable number of semaphores and queues. The system switches tasks ~1000 times per second. This is the environment TimeDoctor grew up in. How does that compare to what you want to do?