Uses a few heuristics to eliminate quite a bit of this gibberish, but an improved splitter class would probably bring indexing much closer to the desired asymptotic behavior.