From: Eugene S. <eug...@gm...> - 2012-10-05 13:42:26
|
> On Oct. 3, 2012, 7:12 p.m., Michel Ludwig wrote: > > Ship It! > > Eugene Shalygin wrote: > Dear Michel, > > with commit c2a2fe238f8054206cca63d209e480809196739b full parsing is there, but why did you remove files loading without Kate? Project loading is slow... What was wrong with that? > > Michel Ludwig wrote: > When KatePart loads a file, it additionally performs auto-detection of the encoding of the file. Thus, the only way of ensuring that Kile's loading of files will give the same results as the one of KatePart would actually be to use the file loading mechanism of KatePart, but that's currently not possible. > > Eugene Shalygin wrote: > Thank you for the reply and patch rework (almost complete rework:) If you would point me the problems before, I be happy to help you with that) > I see. Indeed I forgot about encodings. But as for me it is quite a large overhead --- creating of KatePart. I still want to solve this situation in one or another way. Because what do I see in calgrind drives me crazy: to load 19 documents Kate initializes myspell 231 times and (in the same time!) hunspell --- 215 times! It is more than 20 invocations of spellchecker initializations per document. This is close to to the point of absurdity. May Kile create so many views per document? If it is not the case, I believe this has to be solved. > > Right now I see the following possibilities: > > 1. Use, for instance, KEncodingProber to load document-less files. This can cover majority of encodings. Besides, there, on project loading, where document-less Infos are present, we need to get only structure of includes (or another resources inclusion). Can one expect that user provides localized versions of \include/\input commands? I understand that this way adds some inconsistency between different documents loading pathes, which is not good. > It seems to me that KDevelop is using this approach and I did not see any problems with it. However I do not use other than cyrillic single byte encodings (in addition to UTF8). > > 2. Change KatePart behaviour in order to add possibility to exclude loading config for the document (which can initialize spellchecking, load terminal and who knows what else...) > > 3. Sort out things in Kate that reads config. > > I would go for the first one, because it is simple for us, and seems to do less harm than benefit. > > > Eugene Shalygin wrote: > I would like to add that if one would move towards more sophisticated latex parser, then for sure it must be implemented without KatePart. Kind of vote for number 1 :) > > Michel Ludwig wrote: > Hehe, no, even an extended parser would operate on QStrings only ;) > > I had a closer look at KatePart's file loading code and it seems that if an encoding is preset, no auto-detection is attempted unless some kind of encoding problem occurs. As Kile saves the encodings of project items anyway, all that is necessary is to call 'setCodec' on the QTextStream. This should work in 99.999% of the cases. But can you put the file reading code in the doc manager as well? and use KIO::NetAccess for remote documents? > > Of course, the ultimate solution would be to add a method to KatePart that can read documents without setting up a full KateDocument... > > Eugene Shalygin wrote: > Yes, but not via KatePart --- one can obtain fil contants without it, can not one? > > Regarding loading: brilliant!! I'll prepare the patch against current master. > Do I understand correctly, that you want to call back doc manager for text loading? Maybe it would be better to add parseProjectItem() to parser manager? And we still need to implement encoding guessing for free files? Or to believe that we need to parse fast only project files? > > Michel Ludwig wrote: > No, the loading has to be done *only* for project items during the opening of the project. For stand-alone files and once a project item is open in the editor parsing should be done as it is now. > > We can add methods likde get/setDocumentContents to TextInfo that would allow to set the contents of the document as QStringList (if no document is present). In 'DocumentParserThread::addDocument' we can call 'getDocumentContents' on the TextInfo then, and 'setDocumentContents' can be called in 'loadItem' in the doc manager once the loading is complete. But yes, all the loading should be done inside the doc manager. > > > Eugene Shalygin wrote: > Okay, I've implemented this scheme. It was easy and fast. But a few questions comes up: > 1. Where to clean those m_documentContents? Logicaly it must be done in setDocument(). But this is cynical planning of concurency problems, because QStringList::clear() is not atomic. Guarding m_doc with a mutex would help, but that is too complicated. > 2. Who can clear document contents for Infos, that will not be loaded in an editor? > > Thus I decided to try the following. We add virtual encoding() property to TextInfo (or, perhaps, to Info?). This looks logical, since we have url() there, but as we found, url without encoding does not give access to a contents :). Then, we can return m_doc->encoding() if document is set. We make static QStringList Manager::loadFileContents(const Kurl& url, const QString& encoding). We add QStringList TextInfo::contents() which can either get them from m_doc, or call Manager::loadFileContents() if there is no document set. > > By doing in this way, we do not store files contents in memory longer, then they are needed and avoid race condition between getting and clearing m_documentContents. The most cumbersome place of this way is a change of Info* constructors to add url and encoding. > > I will test this for a few more hours and submit, if you agree. Do you? > > > Michel Ludwig wrote: > There are no problems regarding concurrency w.r.t. m_doc. Copies of QStringList passed to the LaTeX parser will be thread-safe copies - this is ensured by Qt, i.e. one can safely modify or even delete m_doc after the corresponding QStringList has been copied to the parser thread. > > What we should NOT do, however, is call 'Manager::loadFileContents' from TextInfo. The reason being that 'Manager::loadFileContents' might start an event loop (from KIO::NetAccess), and we want to reduce the number of "places" that can launch event loops as much as possible (they are evil!). On the other hand, it is fine that 'Manager::openProjectItem' can launch event loops as we are prepared for it and expect it to happen. So, please just return 'm_documentContents' in 'TextInfo::getDocumentContents' (or maybe even call it 'TextInfo::documentContents()' :) and use 'setDocumentContents' in 'openProjectItem'. > > Afterwards, 'm_documentContents' can possibly be cleared after the call to 'update' in 'openProjectItem', but please check that 'update' will indeed pass the contents to the parser. > > Regarding getting the encoding, you can use 'KileProjectItem::getEncoding()' and there is no need to use m_doc->encoding(), just get the document contents from m_doc if it is set (and as it is done now). Hmm.. Let me describe it once again. Suppose we are setting a document into TextInfo. setDocument(){ ... m_doc = newDoc; <-------------------------------------- thread 1 is here m_docuemntContents.clear(); ... } documentContents() { ... if (!m_doc) { <-------------------------------------- thread 2 is here return m_docuemntContents; } ... } Now QStringList::QStringList(const QStringList&) and QStringList::clear() are executing _simultaneously_. Bang! - Eugene ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/106567/#review19848 ----------------------------------------------------------- On Sept. 26, 2012, 9:12 p.m., Eugene Shalygin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://git.reviewboard.kde.org/r/106567/ > ----------------------------------------------------------- > > (Updated Sept. 26, 2012, 9:12 p.m.) > > > Review request for Kile. > > > Description > ------- > > Kile shedules parsing of all projet items when it opens a project. But because Manager::m_projects does not contain currently loading project at that time, Manager can not resolve projet items to documents and parsing does not happen. > > This patch adds curently parsing project to m_projects collection before invoking projectOpenItem() and thus parsing goes. Then we crate Info* object for each project item (if it does not exist) and sneding it to parsing. > In the next step, in Manager::handleParsingComplete(), TextInfo object is determined via opened documents list, as I understood. To install results for all project items, the patch adds a second branch that determines TextInfo object via KileProjectItem::getInfo() and the item is accessed via Manager::itemFor() (that is why project must be added to m_projects before parsing) > > Also the patch changes Info* classes in a way that makes possible to parse files without loading them into KatePart, i.e. without Document object creation. This speeds up project loading. At least on my machine before most of the time was spent in spell checker initialization. We do not need a spell checker for parsing :). To parse a file we use a tiny helper function that loads file contents (getFileContents() in anonymous namespace, parserthread.cpp). > > Then it tries to fix a race between parsing and cleaning of Info object by separating the cleaning code into cleanInfo() method and calling it from installParserResults() > > With these changes completion works for all bib items and labels in project even if these files are not opened. > > > Diffs > ----- > > src/parser/latexparser.cpp ea0ae0b > src/parser/parserthread.h c41123b > src/parser/parserthread.cpp ad38819 > src/kiledocmanager.cpp 1d652c1 > src/documentinfo.cpp 2c3a3bb > src/documentinfo.h 27ad09f > > Diff: http://git.reviewboard.kde.org/r/106567/diff/ > > > Testing > ------- > > Manual testing > > > Thanks, > > Eugene Shalygin > > |