In a cloud computing environment such as Hadoop, it is not reasonable to rely on centralized resources. In the co-occurrence analysis phase, eos supports simple dictionary-based entity recognition. The dictionary is represented by a Patricia trie held in the main memory of the process, a representation optimized for low resource consumption. In many domains, such as the biomedical domain, dictionaries are very large and cannot be held in main memory. To address this problem, the co-occurrence analysis phase should be pipelined further so that it can support large dictionaries.
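
To make the dictionary lookup concrete, the following is a minimal sketch of dictionary-based entity recognition with a trie. It is not the eos implementation: it uses a plain, uncompressed character trie instead of a Patricia trie, matches case-sensitively, and ignores word boundaries. The names `Trie`, `recognize`, and `recognize_partitioned` are hypothetical. The partitioned variant illustrates only one possible reading of "more pipelining", namely scanning the text once per dictionary partition so that only one partition's trie is in memory at a time.

```python
from typing import Dict, Iterable, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.entity: Optional[str] = None  # set when a dictionary entry ends here


class Trie:
    """Plain character trie (assumption: stands in for the Patricia trie)."""

    def __init__(self, entries: Iterable[str] = ()) -> None:
        self.root = TrieNode()
        for entry in entries:
            self.insert(entry)

    def insert(self, entry: str) -> None:
        node = self.root
        for ch in entry:
            node = node.children.setdefault(ch, TrieNode())
        node.entity = entry

    def longest_match(self, text: str, start: int) -> Optional[Tuple[int, str]]:
        """Return (end_offset, entry) of the longest entry starting at `start`."""
        node = self.root
        best: Optional[Tuple[int, str]] = None
        for i in range(start, len(text)):
            child = node.children.get(text[i])
            if child is None:
                break
            node = child
            if node.entity is not None:
                best = (i + 1, node.entity)
        return best


def recognize(text: str, trie: Trie) -> List[Tuple[int, int, str]]:
    """Greedy left-to-right scan that keeps the longest match at each position."""
    matches: List[Tuple[int, int, str]] = []
    pos = 0
    while pos < len(text):
        hit = trie.longest_match(text, pos)
        if hit is None:
            pos += 1
        else:
            end, entity = hit
            matches.append((pos, end, entity))
            pos = end  # continue after the recognized entity
    return matches


def recognize_partitioned(
    text: str, partitions: Iterable[Iterable[str]]
) -> List[Tuple[int, int, str]]:
    """Hypothetical staged variant: one pass over the text per dictionary partition.

    Overlapping matches found by different partitions are all kept, not resolved.
    """
    matches: List[Tuple[int, int, str]] = []
    for entries in partitions:
        trie = Trie(entries)  # only this partition's trie is alive during the pass
        matches.extend(recognize(text, trie))
    return sorted(matches)


if __name__ == "__main__":
    sentence = "BRCA1 interacts with tumor protein p53 in breast cancer"
    dictionary = ["BRCA1", "p53", "tumor protein p53", "breast cancer"]
    for start, end, entity in recognize(sentence, Trie(dictionary)):
        print(start, end, entity)
```

With a single trie, the longer entry "tumor protein p53" takes precedence over "p53". The partitioned variant trades this global longest-match guarantee for a smaller memory footprint per pass, so a real pipeline would need an additional step to merge overlapping matches.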