From: Colin P. A. <co...@co...> - 2008-01-25 14:23:08
Classes RX_PCRE_MATCHER and RX_PCRE_BYTE_CODE_CONSTANTS violate the single choice principle, as they both contain giant inspect statements on the operation code. And the list of operation codes is defined in yet a third class (RX_PCRE_BYTE_CODE_CONSTANTS).

The pure OO way would be to have a class for the concept of the machine operation and a descendant class for each operation. But as these classes can't be expanded, there is a cost associated with this, compared to using a 32-bit integer to represent instructions (which is a reasonable model of a 32-bit microprocessor instruction; that is, it's the model an assembler programmer has).

I have to decide which way to go for the Unicode engine I am working on. Any opinions one way or the other?

--
Colin Adams
Preston Lancashire

From: CRISMER Paul-G. <Pau...@gr...> - 2008-01-25 14:53:12
Hello Colin,

What is the "cost" you are ready to pay for?

- OO / Single choice:
    cost of non-expandedness + dynamic binding (time)
    benefit of readability (development effort)

- Integer (instruction number) + giant inspect:
    cost - readability (development effort)
    benefit - run-time efficiency (time)

The only way to know the difference between both solutions is to write them both and measure the run-time cost.

- A mixed solution would be the following:
    * a class INSTRUCTION and its descendants model your instruction set
    * at startup you fill an instruction table with one instruction object
    * the program is a sequence of integers that refer to the appropriate instruction in the instruction table
    * execution is something like this: instructions.item (op_code).execute

Hope this helps.

My personal taste is the following:
- first model a well designed (OO) solution and make it work
- if measurable performance problems arise, then optimize.

Best regards,
Paul G. Crismer

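The mixed solution Paul describes (instruction objects held in a table, with the program itself kept as integers) can be sketched roughly as follows. This is a minimal illustration in Python rather than Eiffel, and it is not taken from the Gobo sources: the toy stack machine, the op code names and the class names are all hypothetical, and the table is assumed to hold one shared object per op code.

```python
from abc import ABC, abstractmethod

# Hypothetical op codes for a toy stack machine (not the PCRE byte codes).
OP_PUSH_ONE, OP_ADD, OP_DOUBLE = 0, 1, 2


class Instruction(ABC):
    """Models one machine operation; one descendant per op code."""

    @abstractmethod
    def execute(self, stack):
        """Apply this operation to the evaluation stack."""


class PushOne(Instruction):
    def execute(self, stack):
        stack.append(1)


class Add(Instruction):
    def execute(self, stack):
        right = stack.pop()
        left = stack.pop()
        stack.append(left + right)


class Double(Instruction):
    def execute(self, stack):
        stack.append(stack.pop() * 2)


# Filled once at startup: index = op code, value = shared instruction object
# (assumption: one object per op code).
INSTRUCTIONS = [PushOne(), Add(), Double()]


def run(program):
    """Execute a program given as a sequence of integer op codes."""
    stack = []
    for op_code in program:
        # Counterpart of: instructions.item (op_code).execute
        INSTRUCTIONS[op_code].execute(stack)
    return stack


result = run([OP_PUSH_ONE, OP_PUSH_ONE, OP_ADD, OP_DOUBLE])
print(result)
```

The program stays a compact sequence of integers, while the knowledge of what each op code does lives in exactly one class. The price is one table lookup plus one dynamically bound call per instruction.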
From: Eric B. <er...@go...> - 2008-01-25 16:12:38
CRISMER Paul-Georges wrote:
> What is the "cost" you are ready to pay for?
>
> - OO / Single choice :
>   cost of non-expandedness + dynamic binding (time)
>   benefit of readability (development effort)
>
> - Integer (instruction number) + giant inspect
>   cost - readability (development effort)
>   benefit - run-time efficiency (time)

Assuming that from the client point of view they have the same interface (only the implementation differs), as a client of the library I'm only concerned with speed and memory usage. And especially in the case of regexp. When there is a trade-off to be made, the client should be the winner. That's what happens for the Eiffel language itself: put the burden on the compiler writers, not on the language users.

Now you might say that compiler writers should make it so that the cost of non-expandedness + dynamic binding is not noticeable compared to a giant inspect ;-)

--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com

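Paul's advice to write both variants and measure could look something like the harness below. It is only a sketch in Python with hypothetical op codes and class names. Timings from a Python interpreter say nothing about what an Eiffel compiler would produce; the point is the shape of the experiment: same program, same result, two dispatch styles.

```python
import timeit

# Hypothetical op codes; chosen so the running value stays small.
OP_INC, OP_NEGATE, OP_DEC = 0, 1, 2


def run_inspect(program, value=0):
    """One big branch on the op code (the 'giant inspect' style)."""
    for op_code in program:
        if op_code == OP_INC:
            value += 1
        elif op_code == OP_NEGATE:
            value = -value
        elif op_code == OP_DEC:
            value -= 1
        else:
            raise ValueError(f"unknown op code {op_code}")
    return value


class Inc:
    def execute(self, value):
        return value + 1


class Negate:
    def execute(self, value):
        return -value


class Dec:
    def execute(self, value):
        return value - 1


# Index = op code, value = instruction object (object-table style).
INSTRUCTIONS = [Inc(), Negate(), Dec()]


def run_objects(program, value=0):
    """Dispatch through the instruction table and a dynamically bound call."""
    for op_code in program:
        value = INSTRUCTIONS[op_code].execute(value)
    return value


PROGRAM = [OP_INC, OP_NEGATE, OP_DEC] * 1000

# Both variants must agree before their timings are worth comparing.
assert run_inspect(PROGRAM) == run_objects(PROGRAM)

if __name__ == "__main__":
    for name, fn in (("inspect", run_inspect), ("objects", run_objects)):
        seconds = timeit.timeit(lambda: fn(PROGRAM), number=200)
        print(f"{name}: {seconds:.3f} s")
```

Because both functions share the same signature, a client could swap one for the other without noticing, which is the situation Eric assumes when he says only speed and memory usage matter to him.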
From: Colin P. A. <co...@co...> - 2008-01-25 16:36:39
>>>>> "Eric" == Eric Bezault <er...@go...> writes:

    Eric> Assuming that from the client point of view they have the
    Eric> same interface (only the implementation differs), as a
    Eric> client of the library I'm only concerned in speed and memory
    Eric> usage.

So I should wrap a C library then?

    Eric> Now you might say that compiler writers should make
    Eric> it so that the cost of non-expandedness + dynamic binding
    Eric> should not be noticeable compared to giant inspect ;-)

I'd rather say allow inheritance of expanded types. What are the issues apart from space layout (which could be solved by forbidding conforming inheritance if attributes are added, or any function/attribute redefinition, and using non-conforming inheritance syntax for these cases)?

--
Colin Adams
Preston Lancashire

From: Emmanuel S. [ES] <ma...@ei...> - 2008-01-25 16:42:08
> So I should wrap a C library then?

I would not agree there, everything should be done in Eiffel because in the long term it is better. We can achieve very good performance with Eiffel when things are written with performance in mind.

Manu

From: Colin P. A. <co...@co...> - 2008-01-25 17:27:43
>>>>> "Eric" == Eric Bezault <er...@go...> writes:

    >> So I should wrap a C library then?

I wasn't actually serious about that, by the way (I don't actually know of an available one, for a start, and I'm in favour of everything in pure Eiffel).

    Eric> What I had in mind is more something like EiffelParse
    Eric> vs. geyacc.

Can you expand on that statement please?

--
Colin Adams
Preston Lancashire

From: Eric B. <er...@go...> - 2008-01-25 18:34:30
Colin Paul Adams wrote:
>>>>>> "Eric" == Eric Bezault <er...@go...> writes:
>
>     >> So I should wrap a C library then?
>
> I wasn't actually serious about that

Yes, I knew that you would prefer assembly code ;-)

>
>     Eric> What I had in mind is more something like EiffelParse
>     Eric> vs. geyacc.
>
> Can you expand on that statement please?

I don't know the current status, but 10 years ago using EiffelParse to write parsers (yooc was there to help) produced very slow parsers. EiffelParse has a very nice object-oriented design for the parser implementation. I don't know if it's related or not, but it's slow.

On the other hand, geyacc was written to produce ugly non-object-oriented parsers, based on zillions of integers put in tables or used in inspect statements. The resulting parser looks more or less like what the parser would look like if generated by yacc in C, but with an Eiffel syntax. I don't remember the exact results of the benchmarks, but parsers generated by geyacc were much faster than those written with EiffelParse.

Of course the geyacc solution is usable only if the clients of the generated parsers don't have to dive into the code of the parsers themselves. We should see it as a black-box. I think that a regexp library has the same criteria. That's typically something that clients want to be fast. And clients see it as a black-box (not that many people decide to dive into the code of a regexp implementation to understand how it works).

--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com

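To give a feel for the "integers put in tables" style Eric describes, here is a tiny table-driven recognizer, written in Python purely for illustration. It is not geyacc's actual table layout or generated code: the state numbers, the character classes and the pattern `ab*c` are all made up for this sketch.

```python
# Character classes and states are plain integers, as in a generated parser.
CLASS_A, CLASS_B, CLASS_C, CLASS_OTHER = 0, 1, 2, 3
ERROR = -1

# TRANSITIONS[state][char_class] -> next state; recognises the pattern "ab*c".
TRANSITIONS = [
    [1, ERROR, ERROR, ERROR],      # state 0: expect 'a'
    [ERROR, 1, 2, ERROR],          # state 1: any number of 'b', then 'c'
    [ERROR, ERROR, ERROR, ERROR],  # state 2: accepting, no more input allowed
]
ACCEPTING = {2}

CHAR_CLASSES = {"a": CLASS_A, "b": CLASS_B, "c": CLASS_C}


def char_class(ch):
    """Map a character to its integer class."""
    return CHAR_CLASSES.get(ch, CLASS_OTHER)


def matches(text):
    """Return True if the whole of text matches 'ab*c'."""
    state = 0
    for ch in text:
        state = TRANSITIONS[state][char_class(ch)]
        if state == ERROR:
            return False
    return state in ACCEPTING


print(matches("abbbc"), matches("ab"))
```

The matching loop is a single table lookup per character, and nothing in it explains what the pattern means. That is fine as long as the tables are generated and clients treat the matcher as a black box, which is the condition Eric attaches to this approach.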
From: Eric B. <er...@go...> - 2008-01-25 17:22:39
Colin Paul Adams wrote:
>>>>>> "Eric" == Eric Bezault <er...@go...> writes:
>
>     Eric> Assuming that from the client point of view they have the
>     Eric> same interface (only the implementation differs), as a
>     Eric> client of the library I'm only concerned in speed and memory
>     Eric> usage.
>
> So I should wrap a C library then?

What I had in mind is more something like EiffelParse vs. geyacc.

--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com