Your tokenizer is behaving correctly: "New" and "York" are two different tokens. What you want is called chunking, which groups individual tokens into larger phrases (for example, combining "New" and "York" into the noun phrase "New York"). Here is some background information about chunking.
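To make the idea concrete, here is a toy sketch in plain Java. It is not OpenNLP (a real chunker such as OpenNLP's ChunkerME uses a trained statistical model); it just groups consecutive proper-noun (NNP) POS tags into a single chunk, which is enough to show why "New" and "York" end up together:

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkDemo {
    // Toy chunker: merges runs of consecutive NNP-tagged tokens into one chunk.
    // Real chunkers learn these groupings from annotated data instead of a rule.
    static List<String> chunkProperNouns(String[] tokens, String[] tags) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (tags[i].equals("NNP")) {
                // Extend the current proper-noun chunk.
                if (current.length() > 0) current.append(' ');
                current.append(tokens[i]);
            } else {
                // Close any open proper-noun chunk, then emit the token alone.
                if (current.length() > 0) {
                    chunks.add(current.toString());
                    current.setLength(0);
                }
                chunks.add(tokens[i]);
            }
        }
        if (current.length() > 0) chunks.add(current.toString());
        return chunks;
    }

    public static void main(String[] args) {
        String[] tokens = {"I", "visited", "New", "York", "yesterday"};
        String[] tags   = {"PRP", "VBD", "NNP", "NNP", "NN"};
        // "New" and "York" are separate tokens, but one chunk: [I, visited, New York, yesterday]
        System.out.println(chunkProperNouns(tokens, tags));
    }
}
```

In practice you would get the POS tags from a tagger first, then hand tokens and tags to the chunker, which is exactly the pipeline OpenNLP uses.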
Depending on which NLP library you are using, there is likely built-in support for chunking. For OpenNLP, which appears in your question tags, see this related question: How to extract the noun phrases using Open nlp's chunking parser