Haskell is not without its faults. One of the most universally acknowledged annoyances, even for pros, is keeping track of the different string types. There are, in total, five different types representing strings in Haskell. Remember that Haskell is strongly typed. So if we want to represent strings in different ways, we need different types for them. This motivates the need for these five types, all with slightly different use cases. It’s not so bad when you’re using any one of them. But when you constantly have to convert back and forth between them, it can be a major hassle. In this article we’ll go over these five different types. We’ll examine their different use cases, and observe how to convert between them.

Strings

The String type is the most basic form of representing strings in Haskell. It is a simple type synonym for a list of unicode characters (the Char type). So whenever you see [Char] in your compile errors, know this refers to the basic String type. By default, when you enter a string literal in your Haskell code, the compiler infers it as a String.

    myFirstString :: String
    myFirstString = "Hello"

The list representation of strings gives us some useful behavior. Since a string is actually a list, we can use all kinds of familiar functions from Data.List.

    >> let a = "Hello"
    >> map Data.Char.toUpper a
    "HELLO"
    >> 'x' : a
    "xHello"
    >> a ++ " Person!"
    "Hello Person!"
    >> Data.List.sort "dbca"
    "abcd"

The main drawback of the vanilla string type is its inefficiency. This comes as a consequence of immutability. For instance, suppose we have:

    import Data.Char (toLower)
    import Data.List (sort)

    myFirstString :: String
    myFirstString = "Hello"

    myModifiedString :: String
    myModifiedString = map toLower (sort (myFirstString ++ " Person!"))

Evaluated naively, this builds a total of 4 strings. The first is myFirstString. The second is the " Person!" literal. Appending them reuses the second string as the tail of the result rather than copying it. But then the sorted version will be a third string, and the fourth will be the lowercased version. Each of these is a linked list with one cell per character.
This constant allocation can make our code non-performant and inappropriate for heavy duty operations.

Text

The Text family of string types solves this dilemma. There are two Text types: strict and lazy. Most often, you will use the strict form. However, the lazy form is useful in certain circumstances where you know you won’t need the full string.

The main advantage Text has is that its functions are subject to “fusion”. This means the compiler can actually prevent the issue of multiple allocations we saw in the last example. For instance, if we look at this:

    import qualified Data.Char as C
    import qualified Data.Text as T

    optimizedTextVersion :: T.Text
    optimizedTextVersion = T.cons 'c' (T.map C.toLower (T.append (T.pack "Hello ") (T.pack " Person!")))

With fusion, this can allocate as little as a single Text object at runtime, although how much fusion actually happens depends on the version of the text package and on compiling with optimizations. This makes it substantially more efficient than the String version. So for industrial use of heavy text processing, you are much better off using the Text type than the String type.

ByteString

The third family of types falls into the ByteString category. As with Text, there are strict and lazy variants of bytestrings. Lazy bytestrings are a bit more common than lazy text values though. Bytestrings are the lowest level representation of the characters. It is the closest you can get to the real machine level interpretation of them. At their core, bytestrings are a packed sequence of Word8 values. A Word8 is simply an unsigned 8-bit number, that is, a single byte. How many bytes make up one unicode character depends on the encoding.

Most networking libraries will use bytestrings, as they make the most sense for serialization. When you send information across platforms, you can’t be sure about the encoding on the other end. If you store information in a database, you will often want to use bytestrings as well. Like Text types, they are generally far more efficient than strings.

Conversion

So with all these types floating around, the real problem is converting between them.
It can be enormously frustrating when you want to write some basic code but you have a different string type. You’ll have to look up the conversion if you don’t remember, and this can be annoying. Our first example will be with String and Text. This is quite straightforward. The Data.Text module exports these two functions, which do exactly what you want:

    pack :: String -> Text
    unpack :: Text -> String

There are equivalents in Data.Text.Lazy. We’ll find similar functions for going between ByteStrings and Strings. They exist in the Data.ByteString.Char8 module:

    pack :: String -> ByteString
    unpack :: ByteString -> String

Note these only work with strict ByteStrings (Data.ByteString.Lazy.Char8 has the lazy counterparts), and that they truncate every character to 8 bits, so they are only safe for ASCII or Latin-1 data. To convert between strict and lazy, you’ll want functions from the .Lazy version of a text type. For instance, Data.Text.Lazy exports:

    toStrict :: Data.Text.Lazy.Text -> Data.Text.Text -- (Lazy to strict)
    fromStrict :: Data.Text.Text -> Data.Text.Lazy.Text -- (Strict to lazy)

There are equivalents in Data.ByteString.Lazy. The final conversion we’ll go over is between Text and ByteString. You could use String as an intermediate type with the functions above. But this makes certain assumptions about the encoding and is subject to failure. Going from Text to ByteString is straightforward, assuming you know your data format. The following functions exist in Data.Text.Encoding:

    encodeUtf8 :: Text -> ByteString
    -- LE = Little Endian format, BE = Big Endian
    encodeUtf16LE :: Text -> ByteString
    encodeUtf16BE :: Text -> ByteString
    encodeUtf32LE :: Text -> ByteString
    encodeUtf32BE :: Text -> ByteString

In general, you’ll use UTF8 encoded text and thus encodeUtf8. Decoding is a little more complicated. Simple functions exist in this same library:

    decodeUtf8 :: ByteString -> Text
    decodeUtf16LE :: ByteString -> Text
    decodeUtf16BE :: ByteString -> Text
    decodeUtf32LE :: ByteString -> Text
    decodeUtf32BE :: ByteString -> Text

But these can throw errors if your bytestring does not match the format.
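As a rough illustration of the conversions covered so far, here is a minimal sketch that chains them together. It assumes the text and bytestring packages are available; every top-level name in it (accented, asString, and so on) is made up for this example, and the input is valid UTF-8 by construction, so the plain decodeUtf8 cannot fail here.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Lazy as TL

-- String -> strict Text. "\233" is the escape for the character 'é'.
accented :: T.Text
accented = T.pack "h\233llo"

-- Strict Text -> String.
asString :: String
asString = T.unpack accented

-- Strict Text -> lazy Text -> strict Text.
viaLazyText :: T.Text
viaLazyText = TL.toStrict (TL.fromStrict accented)

-- Text -> ByteString, using UTF-8.
utf8Bytes :: B.ByteString
utf8Bytes = TE.encodeUtf8 accented

-- Strict ByteString -> lazy ByteString -> strict ByteString.
viaLazyBytes :: B.ByteString
viaLazyBytes = BL.toStrict (BL.fromStrict utf8Bytes)

-- ByteString -> Text. Safe here only because we produced the bytes ourselves.
decoded :: T.Text
decoded = TE.decodeUtf8 viaLazyBytes

-- Characters and bytes are counted separately:
-- 'é' is one character but two bytes in UTF-8.
charCount :: Int
charCount = T.length accented   -- 5

byteCount :: Int
byteCount = B.length utf8Bytes  -- 6
```

The difference between charCount and byteCount is the reason the encoding has to be chosen explicitly whenever we cross between Text and ByteString.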
Run-time exceptions are bad, so for UTF8, we have this function:

    decodeUtf8' :: ByteString -> Either UnicodeException Text

This lets us wrap the result in an Either and handle possible errors. For the other formats, we have to rely on functions like:

    decodeUtf16LEWith :: OnDecodeError -> ByteString -> Text

Here OnDecodeError is a specific type of handler. These functions can be particularly cumbersome and difficult to deal with. Luckily, you’ll most often be using UTF8.

OverloadedStrings

So we haven’t touched too much on language extensions yet in my articles. But here’s our first real example of one. It’s intended to show you language extensions aren’t particularly scary! As we saw earlier, Haskell will in general interpret string literals in your code as the String type. This means you are unable to have the following code:

    -- Fails
    myText :: Text
    myText = "Hello"

    myBytestring :: ByteString
    myBytestring = "Hello"

The compiler expects both of these values to be of String type, and not the types you gave. So this will normally throw compiler errors. However, with the OverloadedStrings extension, you can fix this! Extensions use tags like {-# LANGUAGE … #-}. They are generally added at the top of your source file.

    {-# LANGUAGE OverloadedStrings #-}
    -- This works!
    myText :: Text
    myText = "Hello"

    myBytestring :: ByteString
    myBytestring = "Hello"

In fact, for any type you make, you can create an instance of the IsString typeclass. This will allow you to use string literals to represent it.

    {-# LANGUAGE OverloadedStrings #-}

    -- Imported unqualified so the class name is in scope for the instance.
    import Data.String (IsString(..))

    data MyType = MyType String

    instance IsString MyType where
      fromString s = MyType s

    myTypeAsString :: MyType
    myTypeAsString = "Hello"

You can also enable this extension within GHCI. You need to use the command :set -XOverloadedStrings.

Summary

Haskell uses 5 different types for representing strings. Two of these are lazy versions. The String type is a type synonym for a list of characters, and is generally inefficient.
Text represents strings somewhat differently, and can fuse operations together for efficiency. ByteString is a low level representation most suited to serialization. There are a lot of ways to convert between types, and it’s hard to keep them straight. Finally, the OverloadedStrings compiler extension can make your life easier. It allows you to use a string literal to refer to any of your different string types.

If you haven’t had a chance to get started with Haskell yet, you should get our Getting Started Checklist. It will guide you through installation and some basics!

Be sure to check back next week! Now that we understand strings, we’ll dive into another potentially thorny system of different types. We’ll investigate the different numeric types and the simple conversions we can run between them.