Question

My Haskell application reads its input as a list of ByteString values, and I'm using Text.Regex.Posix.ByteString.regexec to find matches. Some of the input contains a character with code 253 (a 1/2 symbol in one IBM PC character set), and it seems that the pattern '.' (i.e., dot, "match any character") doesn't match it. Is there any way to make it match?

Solution

This works for me on a Windows Haskell install:

> length $ ((pack ['\1'..'\253']) =~ "." :: [[ByteString]])
252

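That is, dot matches the characters in this range, including code 253. The range holds 253 bytes and 252 of them matched; the one miss is most likely the newline (code 10), since the default compile options in regex-posix include compNewline, under which '.' does not match a newline.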

Note that the library calls out to an underlying POSIX regex matcher written in C, typically, I assume, the one from glibc.

So I would imagine that any issue you are seeing comes from that particular underlying C implementation.

Something like Text.Regex.TDFA.ByteString might give you more predictable behavior in this case, since it is implemented entirely in Haskell.
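
As a rough way to check this on your own machine, here is a minimal sketch that assumes the regex-tdfa package is installed. The helper name dotMatches is made up for illustration and is not part of any library. It tests '.' against one byte at a time, so you can see exactly which character codes fail instead of only counting matches.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Regex.TDFA ((=~))

-- Hypothetical helper: does "." match a ByteString holding only this byte?
dotMatches :: Char -> Bool
dotMatches c = B.singleton c =~ "."

main :: IO ()
main = do
  -- The byte from the question.
  print (dotMatches '\253')
  -- Every byte in the tested range that "." does NOT match.
  print (filter (not . dotMatches) ['\1' .. '\253'])
```

If the second line prints only the newline, the byte itself is fine and the pattern's newline handling explains the count. Swapping the import for Text.Regex.Posix should run the same check against the C matcher, which makes the two backends easy to compare.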

Other tips

That doesn't make sense. Why would you want to match a half-character? The pattern '.' will match the full character.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow