Programming Praxies – Finding Digit Strings In Powers Of Two

Today’s boredom lead me to solving another programming praxies problem:

Search every power of two below 210000 and return the index of the first power of two in which a target string appears. For instance, if the target is 42, the correct answer is 19 because 219 = 524288, in which the target 42 appears as the third and fourth digits, and no smaller power of two contains the string 42.

The naive solution is simple: keep doubling a number (starting at 1) until you find a sequence that matches the target. Of course once you get 2^64 storing the number as a primitive type will not suffice, where you would need to come up with a solution to store larger numbers.

I  improved on the naive approach by caching the number sequences for each exponent. For the cache I used a type of suffix tree:

private class SuffixTreeNode {
	public short MinExponent;
	public SuffixTreeNode[] Children;
}

Unlike your traditional suffix tree, this one does not compress string sequences. It’s really just a trie, where the number in each sequence is stored implicitly as the index in Children (i.e. these arrays are of length 10). However the data structure is populated and access like a suffix tree: where each suffix of a number sequence is inserted into the tree. Each node in the tree is annotated with the smallest exponent that the sequence can be found in.

Here is the full code:

public static class DigitStringPowerTwoSearch
{
	private static SuffixTreeNode suffixRoot;
	private static short currentExponent;
	private static List<int> currentDigits;

	static DigitStringPowerTwoSearch () {
		FlushCache ();
	}

	internal static void FlushCache() {
		// Exposed as internal really for unit tests only
		currentExponent = 0;
		currentDigits = new List<int> { 1 };
		suffixRoot = new SuffixTreeNode ();
		AddDigitsToTree (currentDigits, 0, 0);
	}

	public static int FindMinBaseTwoExponent(ulong target) {
		var targetDigits = ToDigitArray(target);
		while (currentExponent <= 10000) {
			short exponent = FindMinExponentInTree(targetDigits);
			if (exponent >= 0)
				return exponent;
			AddNextExponentToTree();
		}
		throw new ArgumentOutOfRangeException ("target", 
		                                       "target's digits do not exist in a base two number with exponent " +
		                                       "less or equal to 10,000");
	}

	private static void AddNextExponentToTree() {
		DoubleDigits (currentDigits);
		currentExponent++;

		for (int i = 0; i < currentDigits.Count; i++) {
			AddDigitsToTree (currentDigits, i, currentExponent);
		}
	}

	private static void AddDigitsToTree(IList<int> digits, int startIndex, short exponent) {
		SuffixTreeNode current = suffixRoot;
		for (int i = startIndex; i < digits.Count; i++) {
			int digit = digits [i];
			if (current.Children == null) {
				current.Children = new SuffixTreeNode[10];
			}
			if (current.Children [digit] == null) {
				var newNode = new SuffixTreeNode { MinExponent = exponent };
				current.Children [digit] = newNode;
				current = newNode;
			} else {
				current = current.Children [digit];
				// Here we assume exponent is always the largest exponent,
				// so no need to check/update current.MinExponent
			}
		}
	}

	private static short FindMinExponentInTree(int[] targetDigits) {
		SuffixTreeNode current = suffixRoot;
		foreach(var digit in targetDigits) {
			if (current == null || current.Children == null)
				return -1;
			current = current.Children[digit];
		}
		if (current == null)
			return -1;
		return current.MinExponent;
	}

	private static int[] ToDigitArray(ulong n) {
		if (n == 0) 
			return new int[] { 0 };

		var digits = new List<int>();

		for (; n != 0; n /= 10)
			digits.Add((int)(n % 10));

		var arr = digits.ToArray();
		Array.Reverse(arr);
		return arr;
	}

	private static void DoubleDigits(List<int> digits) {
		bool carry = false;
		for (int i = digits.Count - 1; i >= 0; i--) {
			int d = digits [i] * 2;
			if (carry)
				d++;
			if (d >= 10) {
				d -= 10;
				carry = true;
			} else {
				carry = false;
			}
			digits [i] = d;
		}
		if (carry)
			digits.Insert (0, 1);
	}

	private class SuffixTreeNode {
		public short MinExponent;
		public SuffixTreeNode[] Children;
	}
}

When the function is invoked, if a sequence of numbers have been seen before, then we get O(N) performance (where N = amount of digits in target). Effectively the cache becomes a hash-trie.

A reflection on code reflection: where it helps, and where it hinders.

Today at work I broke some of my team project’s unit tests from a seemingly harmless code change (C#). I simply changed a protected member into an auto-property. Unfortunately the code change was bundled with other changes, for which any other innocent coder would have thought be the changes to blame. But it was the small, innocent (almost code-cosmetic), change that was carried out using a click of the button with Resharper. The test blew up because there was an important piece of code (somewhere) that used reflection to search the class for non-public instance (visible to the class type) member variables and collate members that inherited a specific interface. What is this!? Some silent protocol?? How fragile!

Image

Reflection is a powerful tool that has blessed us with many awesome easy to use API’s. But clearly, it is not suitable to solve all problems. So when should we use reflection? And when should we avoid it? Here are a few common pros and cons that tend to crop up around the topic of reflection (this is not by all means a comprehensive list!):

Good reflection 🙂

  • Dependency injection frameworks.
    Reflection has given us killer IoC tools like Windsor and Unity to solve our dependency problems.
    Clearly refection is a key enabler in the technology, as dependencies and instaintiation is all achievable via binary metadata analysis.
  • Plugin frameworks.
    Plugins frameworks commonly use reflection to dynamically load 3rd party plugins, which it could not do so easily without dynamically loading the additional libraries via reflection.

Bad reflection 😦

  • Refactoring tools and code analysis tools are incompatible.
    The opening example of this post shows that refactoring tools cannot cover what reflection can do: it can make your code brittle. It’s much better to be explicit with your code design; avoid establishing implicit protocols in your code base which your reflection code requires in order to work correctly. Note that static code analysis tools such as refactoring tools, or features like discovering method usage (e.g. with the reshaper tool) are rendered useless with code that uses reflection. This is a dangerous place to be.
  • Adds a super-generic layer of indirection.
    Indirection is a double edged sword: it can improve the design (and yield a number of benefits), but with the cost of adding complexity. The problem with reflection is that it adds a higher degree of indirection than non-reflective code, because it hides static detail such as class names, method names, and property names.  So heavy use of reflection makes the program near impossible to run through static code walk throughs. It also can be very difficult to debug. 
  • Run-time errors instead of compile-time errors.
    This argument can be used for all sorts of mechanisms (such as dynamic type-checking features), but it is a good point to make.  If you have the option of a design that doesn’t require reflection, at least you have a chance your compiler will complain if code changes have broken something. A design using reflection is subject to runtime errors, which in the worst case may not be detected until a release cycle (or in production!).
  • Invocation via reflection is much slower.
    Generally the performance hit from reflection is neglectable, but in sections of code where reflection is used heavily performance will degrade. Performance is much slower in reflection because during invocation the binaries metadata must be inspected at runtime (rather than being precompiled at compile time).

Conclusion

Avoid reflection.

If you think you need to solve a problem by reflection, rework the design (don’t be lazy!). Also don’t use reflection to get at protected data (e.g. non-public members), violating standard language conventions will get you into all sorts of trouble further down the line. Only use reflection where it is absolutely the only way to meet your needs. An acceptable place to use reflection is where there would be no way – at least without enforcing a difficult/cumbersome protocol to adhere to – to implement a solution. So be wary of the drawbacks of reflection before you get crazy with it, and always strive for a solid design!