[tutorial text thomashartman1@gmail.com**20081209222916] hunk ./happs-tutorial.cabal 2 -Version: 0.4.3 +Version: 0.4.5 hunk ./happs-tutorial.cabal 47 + FromDataInstances hunk ./templates/home.st 3 -
Haskell is a great way to program.
+$!Haskell is a great way to program.
!$ hunk ./templates/home.st 5 -And HAppS is a great way to build -web applications. +
HAppS is a great way to build web applications. + Besides having a great feature set in its own right, it is probably the leading solution + for implementing web apps in haskell, + my favorite language. hunk ./templates/home.st 10 -
Especially if you believe, like I do, that as +$!
You also get all the goodness +that comes from programming in haskell, my favorite language.
+!$ + +HAppS is especially great if you believe, like I do, that as hunk ./templates/home.st 29 -
-HAppS is haskell's answer to rails and django (and perl's catalyst, and php). -$! , and every ORM ever written in the history of software) !$ -With HAppS, there is no wrangling data -structures into and out of the database, because there is no database. You use whatever data -structures are natural to your application, and serialize them -transparently using -powerful -machinery that's running behind the -scenes. And if there are existing databases that you need to connect to, you can do that too -- you're not locked in to using macid for everything. - -
The above description is a bit idealized. Keeping everything in macid limits you to how much RAM you can afford, - and even if you can afford a lot (16GB in the amazon cloud costs \$576/month) there's no guarantee that you won't - max that out if your application has a lot of data. - (See the stress test chapter for more caveats.) - The HAppS developers have promised a version of HAppS that will make it easy to share ram across computers - with a technique called sharding, but this hasn't been released in a way that inspires confidence in me - (on hackage, sufficient documentation), - and to be honest I don't really understand how it is supposed to work even in theory. - But what is realistic is to write an alpha version of an application without a database access layer, - and then add persistent hard drive storage (probably database, but could also be flat files or name your poison) - outside of macid when it becomes necessary. Most web projects do not get to a size where this is necessary, - so arguably coding in a database from a start is a form of insidious premature optimization, if you buy - the argument that using a database from the start introduces significant maintenance overhead. +
HAppS is haskell's answer to rails and django (and perl's catalyst, + and php). + $! , and every ORM ever written in the history of software) !$ + With HAppS, there is no wrangling data + structures into and out of the database, because there is no database. You use whatever data + structures are natural to your application, and serialize them + transparently using + powerful + machinery that's running behind the + scenes. And if there are existing databases that you need to connect to, you can do that too + -- you're not locked in to using macid for everything. hunk ./templates/home.st 49 -
You also get all the goodness -that comes from programming in haskell, my favorite language.
+ +There are some limitations to using macid + as a datastore that you should familiarize yourself with + if you are looking into using HAppS for heavy-usage transactional applications. + But long term, HAppS with macid looks promising enough that I've started + using it as a platform for building commercial web 2.0 type apps. (My first + commercial happs app will be public soon, so stay tuned on + techcrunch. :) ) hunk ./templates/home.st 64 + addfile ./templates/macidlimits.st hunk ./templates/macidlimits.st 1 - +
Keeping everything in macid limits you to the amount of RAM you can afford. + Even if you have a business model where you can afford a lot (16GB in the amazon cloud costs \$576/month), + there's no guarantee that you won't + max that out if your application has a lot of data and you are limited to one computer. + +
(See the stress test chapter for more caveats.) + The HAppS core developers have promised HAppS features + that will make it easy to share application state + across many computers, making scaling to ebay-sized proportions relatively straightforward: you + just add more computers to your amazon EC2 cloud. This hasn't happened yet, and I have a feeling that + when (if?) it does happen it won't be a panacea for every scaling problem. But for reference, + the features are replication and sharding. You can search the happs googlegroup to learn more + about this. + + +
My take is that, as currently implemented, Macid may be impractical for an app with tens of thousands of + concurrent sessions, especially if real acid transactionality is required. (E.g., an accounting application.) + The situation improves if you have large numbers of users but don't require transactionality. + (E.g., facebook, reddit, message boards.) + This is true for web apps with a database backend as well, for more or less similar reasons. + +
A realistic way to use HAppS with macid is to write an alpha version of an application + without a database access layer, + and then add persistent hard drive storage (probably database, but could also be flat files or name your poison) + outside of macid only if it becomes necessary. + +
This raises the question: if you are eventually going to have to put in a database back end, why use + macid at all? + + +
The (slightly depressing) truth is that you probably won't have to put in a database back end, + because your app won't be so successful that this is required. If you do need it, you will have + to rewrite a lot of your state code, but it will be worth it, because venture capitalists will be + knocking down your door. (You're the next facebook, ebay, etc.) + +
In the best case scenario, the HAppS core team will deliver on their promise of easy scaling + via ec2 and similar cloud solutions, in which case you won't even have to deal with a state rewrite. + + +
Since macid is available and macid is more straightforward to use than a database layer, + coding in a database from the start is a form of insidious premature optimization, if you buy + the argument that using a database from the start introduces significant maintenance overhead. + +
Having made this lengthy argument for macid... I can see myself rewriting a future tutorial chapter + that describes using HAppS with takusen and postgresql as a backend rather than macid. But that hasn't + happened yet. If it needs to, it will. :) hunk ./templates/macidmigration.st 2 + +
What happens when your data model changes? + +
People who have been following this tutorial for a while may have noticed that every time I come out with + a new version, the existing users and jobs disappear and we start out with a blank slate again. + +
That's because changing the data schema of a happs-with-macid web app (aka migrating) is a chore. + +
It's a chore with traditional web apps too. But it's probably more difficult with macid, especially + given the sparse documentation. + +
This isn't a problem for happstutorial.com, because typically there are only a + few dozen users and jobs, plus whatever dummy data I've entered myself. Who cares? + So far, rather than migrating, I've just wiped the slate clean. + +
However, that isn't going to work for your latest facebook-killer. + +
The good news is, there is a way to migrate HAppS state through various iterations, it's sufficiently + documented if you know where to look, and it's not too painful once you've gotten used to it. + +
The main challenge is finding documentation. + +
The best documentation I have found is the migration thread in the happs googlegroup, + along with the tarred migration example here. + My advice is to read through the thread, then untar the migration example and run the examples in order + as described in the thread. + +
Possibly this is sufficient documentation for you to start doing migrations yourself. If so, great, + ignore what follows. + +
If you feel like you could use more guidance, read on for some notes I put together on my own + migration experience. + +
Though I haven't started migration for the toy job board in happs-tutorial, I am using it for my + commercial project that is under development. This will be the basis of the notes that follow. + +
I almost didn't include these notes, because I wanted to provide an + easy step by step example that referenced the toy job board, as I have done for other tutorial topics. + However, doing that would be quite a bit of work, and after many weeks I still haven't gotten around to it. + I want to get the information out, so I decided to just share what I have. + +
I apologize in advance if some of this seems confusing or fragmentary. I did my best, and I will + try to clean things up and integrate the example into the tutorial rather than snipping from an external app. + +
Ok... General migration notes (haven't yet applied to happstutorial, though I plan on doing this soon): + +
Old state module names should not change, nor be shifted around in + the directory structure (which is really just another kind of name + change). Therefore it's a good idea to start out with a sane + directory hierarchy for schema versions before you start doing + migrations. I recommend keeping app state in one monolithic file, in + a directory devoted to state versions. It makes schema migrations + much easier, as all references to the old state can then be handled + via import qualified StateLast as Old, and referenced via + Old.whatever, for the bits of logic that remain consistent between the + old monolithic state file and the new monolithic state file. Resist + the temptation to split state into multiple files. Bear in mind that due to template haskell the order of + data structure declarations becomes significant, which is usually not + the case with haskell. (I seem to recall this annoyance was part of the + reason why I started splitting things into multiple files to begin + with, which I later regretted because it made migration that much harder.) + +
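For concreteness, the layout being recommended looks something like this (module names are illustrative, patterned on the StateVersions directory used later in this chapter):

```haskell
-- StateVersions/AppState1.hs  -- frozen: the schema as of version 1
-- StateVersions/AppState2.hs  -- the current schema, which begins:

module StateVersions.AppState2 where

import qualified StateVersions.AppState1 as Old

-- Everything from the previous schema is reached through the Old prefix
-- (Old.AppState, Old.UserName, ...), so constructors that keep the same
-- name across versions never collide.
```

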
Don't call HAppS State "State", as this conflicts with the + State datatype in Control.Monad.State. I usually call my state + datatype AppState. + +
There will be code duplication, for the functions that get + transformed into state modifiers by template haskell. This is a bad + code smell, but I think it's unavoidable, since the mkMethods directive has to be told about + all the methods template haskell needs. + +
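The duplication shows up along these lines -- a hedged sketch, in which the appUsers field and the record layout are made up for illustration, while mkMethods itself is the real HAppS.State splice:

```haskell
-- An ordinary update function over the state (hypothetical field names):
addUser :: UserName -> UserInfos -> Update AppState ()
addUser name infos =
    modify (\st -> st { appUsers = M.insert name infos (appUsers st) })

-- The same name has to be repeated in the splice so template haskell
-- can generate the AddUser event type from the function above:
\$(mkMethods ''AppState ['addUser])
```

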
In the old state file (being migrated from), make sure it exports + everything via module OldState ( ... everything gets exported here + ...) where.... My way to do this is load the state module, :browse + in ghci, copy the output, and clean it up using emacs regexen. In + emacs, dired-mark-files-regexp and dired-do-query-replace-regexp are + your friends. + +
Then, (my way), cd StateVersions, cp AppState1.hs AppState2.hs (or whatever version number we're on.) + +
Seems almost too obvious to say, but if you have live customers + you're not going to want to migrate without having tested the + migration in a sandbox first. Create your sandbox, which should + include a snapshot of live customer data. A good way to create a + snapshot is tar -czvf _local.tar.gz _local on your live data. Test thoroughly on this + snapshot before doing the live migration. And even if you think you've tested enough, + tar snapshot your live data before the migration again, just in case. + +
Make a live data snapshot and copy it to your migration sandbox: _local.tar.gz. (If there is an unwieldy large + amount of data, create a smaller data set by setting up a server identical with the live server and doing some + actions manually.) + +
cd StateVersions; cp AppState1.hs AppState2.hs (or whatever version we're on) + For now, we just want a placeholder that will have AppState2 behaving exactly like AppState1. + Change references from AppState1.hs to AppState2.hs in the app code. + Try running the server; the result should be that it compiles and runs, but all data is lost (because we haven't written the migration yet). + +
Roll back from the backup taken earlier: rm -rf _local and tar -xzvf _local.tar.gz. (An explicit reminder to + roll back the live data tar may be omitted from future steps; basically, you keep rolling back until + you get a successful migration.) + +
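The snapshot-and-rollback loop can be sketched as a shell session; here a throwaway directory stands in for the real _local state directory, so nothing touches live data:

```shell
# Sketch of the snapshot / rollback cycle. "_local" here is a throwaway
# stand-in for the real macid state directory.
mkdir -p _local
echo "state" > _local/events-0000000001
tar -czf _local.beforemigration.tar.gz _local   # snapshot before migrating
rm -rf _local                                   # simulate a failed migration attempt
tar -xzf _local.beforemigration.tar.gz          # roll back from the snapshot
cat _local/events-0000000001                    # prints "state"
```

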
modify AppState (or whatever your main State datastructure is), say, adding a field. Don't write a Migrate instance + yet. Try running. You'll probably get an error like "Exception: Non-exhaustive patterns in case." + Kind of a crappy error message if you ask me, but ok. What's happening is the pre-existing data in the + _local directory isn't compatible with your modified AppState. If you rm -rf _local and try running again, it should + work now. But of course you have lost all your data, and need to rollback the live data again for the next step. + +
Now make the necessary changes in code for the migration. + See eelco's and my uploads to the happs google group (tk happs tutorial). The summary is: + modify the Version instance for AppState and add a Migrate instance, along the following lines. First, +
import qualified StateVersions.AppState1 as Old
+
...
+
-- we'll say 2 because this is StateVersions/AppState2.hs
+
-- I don't think it matters what number you use as long as it's higher than the last version,
+
-- but I'd like to have a core dev confirm that intuition.
+
-- I wonder too what happens if you screw up this version number somehow. E.g., what if you specify a version number
+
-- identical to the version you're migrating from?
+
instance Version AppState where
+
mode = extension 2 (Proxy :: Proxy Old.AppState)
+
+
This won't compile, you'll get an error about a missing (Migrate Old.AppState AppState) + instance arising from use of extension. So we supply the instance + +
instance Migrate Old.AppState AppState where
+
migrate (Old.AppState s d) = AppState (migrates s) (migrated d)
+
migrates s = undefined
+
migrated (us, aus, rs, rus) = undefined
+
+
We use undefined just to get it to compile and have something to darcs commit, and then write sensible code later.
+
+
*Main> :! grep -irn AppState1 *.hs
+
Controller.hs:27:import StateVersions.AppState1
+
ControllerAppMigration.hs:18:import StateVersions.AppState1
+
......
+
View.hs:29:import StateVersions.AppState1
+
+
These are the places in the code that need to be switched to use AppState2 instead.
+
+
Let's test this by adding an emails field to UserInfos
+
Actually, first let's try adding an email field to Macid1 and see if we get an error.
+
We do get an error, and it's a weird error:
+
*** Exception: src/Macid1/Repos.hs:45:2-24: Non-exhaustive patterns in case
+
at \$(deriveSerialize ''Repos)
+
+
Is the non-exhaustive pattern error happening because, somewhere behind the scenes, there has been a macid version bump + when it detected that the schema changed? + +
Dunno, but let's try now by switching state to AppState2.hs +
Let's also note the latest checkpoint in the _local directory. It is: ...
+
ls -lth _local/patch-shack_state/ | head -n2
+
... checkpoints-0000000014
+
+
and back it up:
+
tar -czvf _local.beforemigration.tar.gz _local
+
+
Step1, cd StateVersions; cp AppState1.hs AppState2.hs. Ok, that works. (Haven't actually used migration machinery yet.) + +
Now, let's try using the migration machinery, but the migrate is actually just id (so no data structure actually changes). + +
The following is a snip from a working migration instance, where one field in an interior data structure +has been added. (Specifically, UserProfile has gone from a 3 argument constructor to a 4 argument constructor). + +
You might think this looks like a lot of boilerplate for adding a single field, and I would agree. The good news +is that your migration code will look similar if you are making more than just that one change, and the problem +is still tractable. + +
And of course, migrations with a database back-end are no picnic either. + +
Migrate instance example (add a field to UserProfile):
+
+
+ instance Migrate Old.AppState AppState where
+
migrate (Old.AppState s d) = AppState (migrates s) (migrated d)
+
+
-- Nothing changed in sessions -- it's the second arg to AppState (AppDatastore users) that had a field added
+
-- We could have avoided writing migrates by using type synonyms to exactly copy the types from AppState1,
+
-- as is done in eelco's example.
+
-- I prefer to write out the migration explicitly rather than use type synonyms, because then after a successful
+
-- migration the Migrate instance and the old state code can be removed, and you wind up with just a monolithic
+
-- state file evolving over time rather than a sequence of states each with a module dependency on the previous state
+
-- . (I think this is the case -- still have to prove this works.)
+
migrates :: Old.Sessions Old.SessionData -> Sessions SessionData
+
migrates (Old.Sessions s) = Sessions . M.map f \$ s
+
where f :: Old.SessionData -> SessionData
+
f (Old.UserSession (Old.UserName u) ) = UserSession (UserName u)
+
f (Old.AdminSession (Old.AdminUserName u) ) = AdminSession (AdminUserName u)
+
+
migrated (Old.Users us, Old.AdminUsers aus, Old.Repos rs, Old.RepoUsers rus) =
+
( (Users . M.map ui . M.mapKeys uk \$ us),
+
(AdminUsers . M.map auv . M.mapKeys auk \$ aus),
+
(Repos . M.map rv . M.mapKeys rk \$ rs),
+
(RepoUsers . IXS.fromSet . S.map ru . IXS.toSet \$ rus )
+
)
+
where ui (Old.UserInfos p (Old.UserProfile c bl av ) ) = UserInfos p (UserProfile S.empty c bl av)
+
uk (Old.UserName u) = UserName u
+
auk (Old.AdminUserName u) = AdminUserName u
+
auv (Old.AdminUserInfos p) = AdminUserInfos p
+
rv (Old.Repo (Old.UserName u) bud blu isp) = Repo (UserName u) bud blu isp
+
rk (Old.RepoName n) = RepoName n
+
ru (Old.RepoUser (Old.RepoName n) (Old.UserName u) ) = RepoUser (RepoName n) (UserName u)
+
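To see the shape of the pattern with no HAppS machinery at all, here is a distilled plain-haskell sketch of a field-adding migration. The type and field names are hypothetical stand-ins for the ones in the instance above:

```haskell
import qualified Data.Map as M

-- Old schema: a three-field profile.
data OldProfile = OldProfile String String String

-- New schema: the same profile with an email list added up front.
data Profile = Profile [String] String String String deriving (Eq, Show)

-- The migration is a total function from old values to new ones,
-- supplying a default ([] here) for the field that didn't exist before.
migrateProfile :: OldProfile -> Profile
migrateProfile (OldProfile c bl av) = Profile [] c bl av

-- Lifting it over a whole keyed store is just a map.
migrateUsers :: M.Map String OldProfile -> M.Map String Profile
migrateUsers = M.map migrateProfile
```

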
The most important lesson I learned is that putting all application state into macid won't work if you +
Sidenote: as (I think) 37 Signals said, "you don't have a scaling problem." I certainly don't have a scaling problem, + and if I did I would probably be jumping for joy even while I was tearing my hair out trying to figure out + how to accommodate more users. My current plans are for a membership site with paying user count, optimistically, + in the high thousands -- so I'm okay using macid even with all the caveats. I'm mainly working on the + user scaling problem because I find it interesting and am learning a lot; secondarily because I have ideas + in the works that might actually run into the RAM limit. + +
That said, the most important lesson I learned is that putting all application state into macid won't work if you hunk ./templates/macidstresstest.st 99 - -$!
To be honest, maybe not. !$ - -$! -
I have done some preliminary testing to answer this question, and so far the results have been disappointing. - -
I am hoping that I am doing something wrong, and that there is a way of using macid effectively - for more than just toy applications. I also asked the HAppS googlegroup for help, and if there is a - solution for the problems I found I will definitely be sharing it in the tutorial, so stay tuned. - -
I am seeking feedback from HAppS experts and educated users on the following questions: -
Thanks in advance for anybody who can help me push HAppS to 11! - -$! - -
Still chugging... At 1:03, about 15 minutes after we started, the 200th - user is inserted. The jobs page loads slowly, but that's to be expected with - a 20000 jobs long pagination. I ctrk-c out, and restart. The state file for the last experiment - is 542M large. The jobs app gives the startup message ("exit :q ghci completely...") - but it seems to be starting very slowly, and gives no feedback why. - Also emacs is sluggish. I restarted at 1:10, it's 1:14 and localhost:5001 still doesn't show anything. - At 1:20 http://localhost:5001 loads normally. - - - -!$ - +
Obviously this chapter of the happs tutorial is in progress and, to some extent, in flux. There is a + thread about the stress tests in the happs googlegroup that you can have a look at + for even more information. I also say a few more words about what macid isn't good for in + macid limits. hunk ./templates/missinghappsdocumentation.st 15 - +(But cc the happs googlegroup as well! :) )