Why is this Haskell program so much slower than an equivalent Python one?

read is slow. For bulk parsing, use bytestring or text primitives, or attoparsec.

I did some benchmarking. Your original version ran in 23,9 secs on my computer. The version below ran in 0.35 secs:

import qualified Data.ByteString.Char8 as Bimport Control.Applicativeimport Data.Maybeimport Data.Listimport Data.Charmain = print . sum =<< getIntListgetIntList :: IO [Int]getIntList =    map (fst . fromJust . B.readInt) . B.words <$> B.readFile "test.txt"

By specializing the parser to your test.txt file, I could get the runtime down to 0.26 sec:

getIntList :: IO [Int]          getIntList =    unfoldr (B.readInt . B.dropWhile (==' ')) <$> B.readFile "test.txt"

python performance haskell io

Read is slow

Fast read, from this answer, will bring you down to 5.5 seconds.

import NumericfastRead :: String -> IntfastRead s = case readDec s of [(n, "")] -> n

Strings are Linked Lists

In Haskell the String type is a linked list. Using a packed representation (bytestring if you really only want ascii but Text is also very fast and supports unicode). As shown in this answer, the performance should then be neck and neck.

python performance haskell io

I would venture to guess that a big part of your problem is actually words. When you map read . words, what you're actually doing is this:

Scan the input looking for a space, building a list of non-spaces as you go. There are a lot of different kinds of spaces, and checking any character that's not a common type of space additionally involves a foreign call to a C function (slow). I'm planning to fix this sometime, but I haven't gotten around to it yet, and even then you'll still be building and throwing away lists for no good reason, and checking for spaces when you really just want to check for digits.
Read through the list of accumulated characters to try to make a number out of them. Produce the number. The accumulated list now becomes garbage.
Go back to step 1.

This is a fairly ridiculous way to proceed. I believe you can even do better using something horrible like reads, but it would make more sense to use something like ReadP. You can also try fancier sorts of things like stream-based parsing; I don't know if that will help much or not.

CodeHunter

Why is this Haskell program so much slower than an equivalent Python one?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last