From: Denys Rtveliashvili <rtvd@ma...>  20070817 18:11:06

>>>> The problem is not that the VOPs don't exist. The problem is that >>>> SBCL always considers doing arithmetic operations on fixnum (tagged) >>>> values preferable to doing arithmetic operations on untagged values >>>> unless it absolutely has no choice. Changing this requires major >>>> surgery on the compiler backend. >>>> >> Nathan: do you have a sketch (never mind how handwavy) of how to do this, >> or whatever idea you have for improving the representation selection? >> > > I haven't thought about this in a while. My last idea was to unify > the VOPs fixnums/signed/unsigned numbers so that, e.g., you'd have a > single VOP for + which has something like: > > (:types (fixnum fixnum fixnum) (signednum signednum signednum) > (unsignednum unsignednum unsignednum) > ;; something new that we could thow into the mix on x86oids > (fixnum unsignednum fixnum) ...) > > and then representation selection is responsible for determining some > optimal assignment that satisfies the constraints of defs and uses, > inserting the necessary coercions. I'm not so sure this is a > tractable approach; it looks vaguely like a an ugly graphcoloring > problem and I don't see any great heuristics for it. > > I also didn't think about having to deal with all the floatingpoint > VOPsthey'd have to be integrated into the giant + (or whatever) VOP. > Ugh. > > It might be possibleand this is even more of a handwavy idea than > the aboveto assign representations *first* and then drive the VOP > selection from that. This might entail choosing particular VOPs up > frontfor instance, we only have one VOP to do ub8array reffing, so > that's a known quantity. And then iteratively growing the solution > from there ("well, that means this index variable wants to be an > unsignednum..."). Even if backtracking is required ("hmmm, this > variable should be an unsignednum for use #1, but these other two > uses want taggednums, so it should really be a taggednum..."), it > can probably be done fairly efficiently. And the IR2 phase is not > where the bulk of the compilation time goes in any event. > > Another option would be to do what we're currently doing, but have a > "representation optimization" phase that runs after representation > selection to do some fixingup of dumb decisions. I have even less > idea how this would work. > > Other ideas...? I feel like I'm rambling a little, but you explicitly > asked for handwavyness. :) > > I keep meaning to look at some of the papers from the functional > programming community on what they call representation > selectionwhich seems somewhat close to the problem we're attacking > herebut I haven't found the time to sit down and decipher > everything. > Hi Nathan, I'm trying to understand this task better. Please, let me know if I am not right about my assumptions. I assume that the scope of optimization is a single function (possibly, with some other functions inlined), and that the function is a list of consequent operations which can map to a variety of VOPs. Each VOP has a number of related values. And each of these values is typed. So, it is necessary to select the VOPs keeping in mind that they need to have their values/variables matched. Then, I assume that the values which are received by the function and returned by it are tagged. This assumption is a minor point and I am probably wrong about it. The solution is a sequence of VOPs, mapped from the given operations that their related values match and the overall cost of the chain is minimal. Another assumption is that there is no automatic conversion between possible representations of a value If it, the scheme will be a bit different but let's not consider it now. So, we have n operations and m values. As long as available VOPs can have different types of incoming and outgoing values, each of the values can have a limited set of possible types. Let's say that the mth value has Vm possible types. In this case, the number of possible solutions is not larger than V1*V2*..*Vm. This is probably a big number. I guess it could be close to 4^m for VOPs related to fixnums only. Also, each operation can be represented as a number of VOPs. If we say that the nth operation has Om mappings to VOPs then we can also say that the number of possible solutions is not larger than O1*O2*..*On. This is a very big number as well. However, in reality no one will be willing to check all the possibilities. And the chosen VOP influences the types of the values. So, here are some heuristics and optimizations which might be useful: 1. For each step when a VOP has to be selected the amount of possible options is narrower than just all VOPs related to the operation. Only VOPs which can accept values of types predefined by the steps before can be used. This makes the search narrower, and I guess the invalid VOPs can be filtered out in liner time. 2. Then, for each value involved, I propose to estimate a "costs" of each of its types as a simple sum of the cost of all VOPs referencing the value of the same type. 3. An inevitable step is choosing the sequence of VOPs. It is necessary to try all possible variants (ideally). The task is computationally heavy, and on each step it is necessary to estimate the cost of the sequence and compare it with previous possibilities. Here I propose to chose VOPs based on the "cost" of the related values and starting from those with smaller cost. In case the cost is the same, a VOP with the smaller "cost" should be chosen first. Of course, the order of selection for VOPs should be determined beforehand. The search can be artificially limited by a predefined number of tries (100, for instance), just to evade problems on long sequences. 4. Another possible way to optimize this would be splitting the sequence of VOPs into looselycoupled chunks. It will be much easier to estimate costs of different possibilities in each of the chunks for each possible set of types of the values shared between the chunks. Regards, Denis Rtveliashvili 
From: Alexander Kjeldaas <alexander.kjeldaas@gm...>  20070818 19:50:07

Cutting costly representations shouldn't be that hard. If the cost of converting from representation X to representation Y is C_xy, then for any point in a chain of VOPs, you can cut the search if the cost of using X at this point and forward is larger than the cost of using Y + C_xy. Alexander On 8/17/07, Denys Rtveliashvili <rtvd@...> wrote: > > >>>> The problem is not that the VOPs don't exist. The problem is that > >>>> SBCL always considers doing arithmetic operations on fixnum (tagged) > >>>> values preferable to doing arithmetic operations on untagged values > >>>> unless it absolutely has no choice. Changing this requires major > >>>> surgery on the compiler backend. > >>>> > >> Nathan: do you have a sketch (never mind how handwavy) of how to do this, > >> or whatever idea you have for improving the representation selection? > >> > > > > I haven't thought about this in a while. My last idea was to unify > > the VOPs fixnums/signed/unsigned numbers so that, e.g., you'd have a > > single VOP for + which has something like: > > > > (:types (fixnum fixnum fixnum) (signednum signednum signednum) > > (unsignednum unsignednum unsignednum) > > ;; something new that we could thow into the mix on x86oids > > (fixnum unsignednum fixnum) ...) > > > > and then representation selection is responsible for determining some > > optimal assignment that satisfies the constraints of defs and uses, > > inserting the necessary coercions. I'm not so sure this is a > > tractable approach; it looks vaguely like a an ugly graphcoloring > > problem and I don't see any great heuristics for it. > > > > I also didn't think about having to deal with all the floatingpoint > > VOPsthey'd have to be integrated into the giant + (or whatever) VOP. > > Ugh. > > > > It might be possibleand this is even more of a handwavy idea than > > the aboveto assign representations *first* and then drive the VOP > > selection from that. This might entail choosing particular VOPs up > > frontfor instance, we only have one VOP to do ub8array reffing, so > > that's a known quantity. And then iteratively growing the solution > > from there ("well, that means this index variable wants to be an > > unsignednum..."). Even if backtracking is required ("hmmm, this > > variable should be an unsignednum for use #1, but these other two > > uses want taggednums, so it should really be a taggednum..."), it > > can probably be done fairly efficiently. And the IR2 phase is not > > where the bulk of the compilation time goes in any event. > > > > Another option would be to do what we're currently doing, but have a > > "representation optimization" phase that runs after representation > > selection to do some fixingup of dumb decisions. I have even less > > idea how this would work. > > > > Other ideas...? I feel like I'm rambling a little, but you explicitly > > asked for handwavyness. :) > > > > I keep meaning to look at some of the papers from the functional > > programming community on what they call representation > > selectionwhich seems somewhat close to the problem we're attacking > > herebut I haven't found the time to sit down and decipher > > everything. > > > Hi Nathan, > > I'm trying to understand this task better. Please, let me know if I am > not right about my assumptions. > > I assume that the scope of optimization is a single function (possibly, > with some other functions inlined), and that the function is a list of > consequent operations which can map to a variety of VOPs. > Each VOP has a number of related values. And each of these values is > typed. So, it is necessary to select the VOPs keeping in mind that they > need to have their values/variables matched. > Then, I assume that the values which are received by the function and > returned by it are tagged. This assumption is a minor point and I am > probably wrong about it. > > The solution is a sequence of VOPs, mapped from the given operations > that their related values match and the overall cost of the chain is > minimal. > > Another assumption is that there is no automatic conversion between > possible representations of a value If it, the scheme will be a bit > different but let's not consider it now. > > So, we have n operations and m values. As long as available VOPs can > have different types of incoming and outgoing values, each of the values > can have a limited set of possible types. Let's say that the mth value > has Vm possible types. > > In this case, the number of possible solutions is not larger than > V1*V2*..*Vm. This is probably a big number. I guess it could be close to > 4^m for VOPs related to fixnums only. > > Also, each operation can be represented as a number of VOPs. If we say > that the nth operation has Om mappings to VOPs then we can also say > that the number of possible solutions is not larger than O1*O2*..*On. > This is a very big number as well. > > However, in reality no one will be willing to check all the > possibilities. And the chosen VOP influences the types of the values. > > So, here are some heuristics and optimizations which might be useful: > 1. For each step when a VOP has to be selected the amount of possible > options is narrower than just all VOPs related to the operation. Only > VOPs which can accept values of types predefined by the steps before can > be used. This makes the search narrower, and I guess the invalid VOPs > can be filtered out in liner time. > > 2. Then, for each value involved, I propose to estimate a "costs" of > each of its types as a simple sum of the cost of all VOPs referencing > the value of the same type. > > 3. An inevitable step is choosing the sequence of VOPs. It is necessary > to try all possible variants (ideally). The task is computationally > heavy, and on each step it is necessary to estimate the cost of the > sequence and compare it with previous possibilities. Here I propose to > chose VOPs based on the "cost" of the related values and starting from > those with smaller cost. In case the cost is the same, a VOP with the > smaller "cost" should be chosen first. Of course, the order of selection > for VOPs should be determined beforehand. The search can be artificially > limited by a predefined number of tries (100, for instance), just to > evade problems on long sequences. > > 4. Another possible way to optimize this would be splitting the sequence > of VOPs into looselycoupled chunks. It will be much easier to estimate > costs of different possibilities in each of the chunks for each possible > set of types of the values shared between the chunks. > > Regards, > Denis Rtveliashvili > >  > This SF.net email is sponsored by: Splunk Inc. > Still grepping through log files to find problems? Stop. > Now Search log events and configuration files using AJAX and a browser. > Download your FREE copy of Splunk now >> http://get.splunk.com/ > _______________________________________________ > Sbcldevel mailing list > Sbcldevel@... > https://lists.sourceforge.net/lists/listinfo/sbcldevel > 
From: Denys Rtveliashvili <rtvd@ma...>  20070819 07:26:20

> Cutting costly representations shouldn't be that hard. > > If the cost of converting from representation X to representation Y is > C_xy, then for any point in a chain of VOPs, you can cut the search if > the cost of using X at this point and forward is larger than the cost > of using Y + C_xy. > > Alexander > Alexander, It is not that easy too. Choosing a "cheaper" representation for a value in such manner you also choose the related VOPs. And it could be that this chose implies narrowing a choice for other values dependent through these VOPs. So, making an early decision could make the algorithm be blind, couldn't it? Regards, Denys Rtveliashvili 