Thursday, September 15, 2011

The dark depths of this beast.



First a warning for those with heart conditions, epilepsy, or are pregnant; this post contains code snippets that have maimed the hale and healthy. YOU'VE BEEN WARNED!

One of the things I noticed while doing the last bit of work is how often the core part of the compiler iterates over the same damned thing. It needs to take a sequence of tokens and figure out what order of execution is necessary so that they 'make sense'. It's the core idea of the language, and making it work well, reliably, and fast/scalably is going to be key to the thing being viable. And frankly, its the sort of thing that the experimental language is there to experiment with.

What I was doing was simply iterating through the collection of tokens, if I found one to reduce, I did the reduction and then via recursion worked on the now smaller collection. Eventually I'd try everything and it'd fail, or I'd find what I was looking for and exit out early.

That though starts at the beginning each time. And it doesn't take any sort of type info into account when reducing things to see if we're "closer" to our answer. And it provides poor support for partial consumption, like the implicit feature I discussed earlier (this was actually the downfall of the previous iteration).

So I spend the past few days thinking and tonight redoing that core bit of code. And I now have the most horrible C# I've ever seen:

 using System;  
 using System.Collections.Generic;  
 using System.Linq;  
 using System.Text;  
 using Tangent.Lang.Compilation.IntermediateForm;  
 using System.CodeDom.Compiler;  
 using System.Threading.Tasks;  
 using System.Collections.Concurrent;  
   
 namespace Tangent.Lang.Compilation {  
   public partial class Compiler {  
     private IEnumerable<List<List<Element>>> UnflattenElements(List<Element> flatElements, Type targetType, List<Type> scope, CompilerErrorCollection errors, ConsumeDirection consumeDirection) {  
   
       System.Diagnostics.Debug.WriteLine("[{0}] <- {1}", targetType, string.Join(":", flatElements));  
   
       if (targetType == Type.ExecutionToken) {  
         yield return new List<List<Element>>() { flatElements };  
         yield break;  
       }  
   
       if (flatElements.Count == 0) {  
         yield break;  
       }  
   
       Element test = null;  
       if (consumeDirection == ConsumeDirection.Right) {  
         test = flatElements.First();  
       } else {  
         test = flatElements.Last();  
       }  
   
       if (targetType.IsExplicitIdentifier) {  
         if (!string.IsNullOrWhiteSpace(test.Identifier) && targetType == LiteralTypeFor(test.Identifier)) {  
           List<Element> rework = new List<Element>(flatElements);  
           rework[consumeDirection == ConsumeDirection.Right ? 0 : flatElements.Count - 1] = new IdentifierLiteral(test.Identifier, LiteralTypeFor(test.Identifier));  
           yield return new List<List<Element>>() { rework };  
           yield break;  
         }  
       }  
   
       if (flatElements.Count == 1) {  
         if (flatElements[0].EffectiveType != null && flatElements[0].EffectiveType.IsSubTypeOf(targetType)) {  
           yield return new List<List<Element>>() { flatElements };  
           yield break;  
         }  
       }  
   
       if (test.EffectiveType != null && test.EffectiveType.IsSubTypeOf(targetType)) {  
         yield return new List<List<Element>>() { flatElements };  
       }  
   
       int start = consumeDirection == ConsumeDirection.Right ? 0 : flatElements.Count - 1;  
       int end = consumeDirection == ConsumeDirection.Right ? flatElements.Count : -1;  
       int step = consumeDirection == ConsumeDirection.Right ? 1 : -1;  
       for (int i = start; i != end; i += step) {  
         IEnumerable<List<MethodDefinition>> possibilityStack = null;  
         List<Element> currentResolutionState = null;  
         if (flatElements[i] is ReResolveElement) {  
           possibilityStack = ResolveIdentifier(flatElements[i].Identifier, scope, errors)  
             .Signal(elements => currentResolutionState=elements)  
             .Select(elements => elements.Select(element => element.EffectiveType.ReductionRules[0]).ToList());  
   
         } else if (flatElements[i].EffectiveType.ReductionRules.Any()) {  
           possibilityStack = new List<List<MethodDefinition>>() { new List<MethodDefinition>(flatElements[i].EffectiveType.ReductionRules) };  
         }  
   
         if (possibilityStack != null) {  
           foreach (var possibles in possibilityStack) {  
   
             while (possibles.Any()) {  
               HashSet<MethodDefinition> best = HighestPriorityMethods(possibles, consumeDirection == ConsumeDirection.Right ? ConsumeDirection.Left : ConsumeDirection.Right);  
               possibles.RemoveAll(def => best.Contains(def));  
               List<Element> paramElements;  
   
               if (best.First().Direction == ConsumeDirection.Right) {  
                 if (i == flatElements.Count - 1) {  
                   paramElements = new List<Element>();  
                 } else {  
                   paramElements = new List<Element>(flatElements.GetRange(i + 1, flatElements.Count - i - 1));  
                 }  
               } else {  
                 if (i == 0) {  
                   paramElements = new List<Element>();  
                 } else {  
                   paramElements = new List<Element>(flatElements.GetRange(0, i));  
                 }  
               }  
   
               Dictionary<MethodDefinition, IEnumerator<List<List<Element>>>> possibleParamPermutations = new Dictionary<MethodDefinition, IEnumerator<List<List<Element>>>>();  
               foreach (var entry in best) {  
                 possibleParamPermutations[entry] = UnflattenElements(paramElements, entry.Consumes, scope, errors, entry.Direction).GetEnumerator();  
               }  
   
               while (possibleParamPermutations.Any()) {  
                 bool permutationsLeft = false;  
                 do {  
                   permutationsLeft = false;  
                   List<List<Element>> passback = new List<List<Element>>();  
                   Dictionary<MethodDefinition, IEnumerator<List<List<Element>>>> possibleSolutions = new Dictionary<MethodDefinition, IEnumerator<List<List<Element>>>>();  
   
                   foreach (var entry in best.Where(def => possibleParamPermutations.ContainsKey(def))) {  
                       
                     if (possibleParamPermutations[entry].MoveNext()) {  
                       permutationsLeft = true;  
                       foreach (var paramPermutation in possibleParamPermutations[entry].Current) {  
                         possibleSolutions[entry] = UnflattenElements(Reduce(flatElements, i, paramPermutation, entry, currentResolutionState), targetType, scope, errors, consumeDirection).GetEnumerator();  
                         if (!possibleSolutions[entry].MoveNext()) {  
                           possibleSolutions.Remove(entry);  
                         }  
                       }  
                     } else {  
                       // no op; eventually permutationsLeft will remain false.  
                       possibleParamPermutations.Remove(entry);  
                     }  
                   }  
   
                   if (possibleSolutions.Any()) {  
                     List<MethodDefinition> solutionDefs = new List<MethodDefinition>(possibleSolutions.Keys);  
                     while (solutionDefs.Any()) {  
                       HashSet<MethodDefinition> bestSolutions = HighestPriorityMethods(solutionDefs, consumeDirection == ConsumeDirection.Right ? ConsumeDirection.Left : ConsumeDirection.Right);  
                       solutionDefs.RemoveAll(def => bestSolutions.Contains(def));  
   
                       List<MethodDefinition> cleanup = new List<MethodDefinition>();  
                       while (bestSolutions.Any()) {  
                           
                         foreach (var bestSol in bestSolutions) {  
                           System.Diagnostics.Debug.WriteLine(string.Format("({2}) {0} : [{1}]", bestSol, string.Join(" =||= ", possibleSolutions[bestSol].Current.Select(line => string.Join(":", line))), flatElements.Count));  
                           passback.AddRange(possibleSolutions[bestSol].Current);  
                           if (!possibleSolutions[bestSol].MoveNext()) {  
                             cleanup.Add(bestSol);    
                           }  
                         }  
   
                         bestSolutions.RemoveWhere(key=>cleanup.Contains(key));  
                         cleanup.Clear();  
                         yield return passback;  
                       }  
                     }  
                   }  
                 } while (permutationsLeft == true);  
               }  
             }  
           }  
         }  
       }  
     }  
   
     private List<Element> Reduce(List<Element> flatElements, int ix, List<Element> paramPermutation, MethodDefinition signature, List<Element> currentResolutionState) {  
       List<Element> result = new List<Element>();  
       Func<int, Element> getReductionElement = (i) => {  
         if (flatElements[i] is ReResolveElement) {  
           if (currentResolutionState == null) {  
             throw new ArgumentNullException();  
           }  
   
           return currentResolutionState.Where(e => e.EffectiveType.ReductionRules[0] == signature).First();  
         } else {  
           return flatElements[i];  
         }  
       };  
         
       if (signature.Consumes == Type.ExecutionToken) {  
   
         if (ix != 0) {  
           result.AddRange(flatElements.GetRange(0, ix));  
         }  
   
         result.Add(new ReductionElement(getReductionElement(ix), new Literal("", Type.ExecutionToken, null), signature));  
   
         if (ix + 1 != flatElements.Count) {  
           result.AddRange(flatElements.GetRange(ix + 1, flatElements.Count - ix - 1));  
         }  
   
         return result;  
       }  
   
       if (signature.Direction == ConsumeDirection.Right) {  
         if (ix != 0) {  
           result.AddRange(flatElements.GetRange(0, ix));  
         }  
   
         if (signature.Consumes.IsExplicitIdentifier) {  
           result.Add(new ReductionElement(getReductionElement(ix), new IdentifierLiteral(paramPermutation[0].Identifier, LiteralTypeFor(paramPermutation[0].Identifier)), signature));  
         } else {  
           result.Add(new ReductionElement(getReductionElement(ix), paramPermutation[0], signature));  
         }  
   
         if (paramPermutation.Count != 1) {  
           result.AddRange(paramPermutation.GetRange(1, paramPermutation.Count - 1));  
         }  
   
         return result;  
       }  
   
       if (signature.Direction == ConsumeDirection.Left) {  
         if (paramPermutation.Count != 1) {  
           result.AddRange(paramPermutation.GetRange(0, paramPermutation.Count - 1));  
         }  
   
         if (signature.Consumes.IsExplicitIdentifier) {  
           result.Add(new ReductionElement(getReductionElement(ix), new IdentifierLiteral(paramPermutation.Last().Identifier, LiteralTypeFor(paramPermutation.Last().Identifier)), signature));  
         } else {  
           result.Add(new ReductionElement(getReductionElement(ix), paramPermutation.Last(), signature));  
         }  
   
         if (ix != flatElements.Count - 1) {  
           result.AddRange(flatElements.GetRange(ix + 1, flatElements.Count - ix - 1));  
         }  
   
         return result;  
       }  
   
       throw new ApplicationException("Got to unreachable code somehow...");  
     }  
   
     private static HashSet<MethodDefinition> HighestPriorityMethods(List<MethodDefinition> candidates, ConsumeDirection preferredDirection) {  
       if (candidates.Count <= 1) {  
         return new HashSet<MethodDefinition>(candidates);  
       }  
   
       HashSet<MethodDefinition> best = new HashSet<MethodDefinition>();  
       best.UnionWith(candidates.Where(def => def.Consumes == Type.ExecutionToken));  
       if (best.Any()) {  
         return best;  
       }  
   
       best.UnionWith(candidates.Where(def => def.Consumes.IsExplicitIdentifier && def.Direction == preferredDirection));  
       if (best.Any()) {  
         return best;  
       }  
   
       best.UnionWith(candidates.Where(def => def.Consumes.IsExplicitIdentifier));  
       if (best.Any()) {  
         return best;  
       }  
   
       var workset = candidates.Where(def => def.Direction == preferredDirection).ToList();  
       if (workset.Count == 0) {  
         workset = candidates;  
       }  
   
       if (workset.Count == 1) {  
         return new HashSet<MethodDefinition>(workset);  
       } else if (workset.Count > 1) {  
         var superBest = workset.Where(candidate => workset.TrueForAll(other => candidate == other || candidate.IsMoreSpecificThan(other)));  
         if (superBest.Any()) {  
           return new HashSet<MethodDefinition>(superBest);  
         } else {  
           return new HashSet<MethodDefinition>(workset);  
         }  
       }  
   
       return new HashSet<MethodDefinition>(candidates);  
     }  
   }  
   
   public static class IEnumerableSignalExt {  
     public static IEnumerable<T> Signal<T>(this IEnumerable<T> collection, Action<T> signal){  
       foreach (var entry in collection) {  
         signal(entry);  
         yield return entry;  
       }  
     }  
   }  
 }  

The yielding allows me to try the best paths first. If we find a solution, we can exist out early. Unfortunately for maintainers (me), there's like 3 different places that can happen. And then we usually end up about 10 steps deep in recursion within this construct.

And the saddest thing of all is that this is actually easier to debug than what was there. It's harder to maintain, and certainly harder to understand. But performance is exponentially better even though this nests loops like it's going out of style. And it will support features in the future like partial consumption of a sub-expression based on user-defined methods.

No comments:

Post a Comment