Reading assignment:
I have written a short description of the Baum-Welch algorithm, with a worked example, which the lecture notes follow. You can download the complete description here.
Durbin et al., section 3.3, and Ewens and Grant, section 12.2.3 in particular, present complementary ways of thinking about the Baum-Welch expectation-maximization problem.
None of our books really covers gene prediction by HMMs in sufficient detail, so I'll refer you to the GENSCAN paper. In addition, there is a fairly new paper from David Haussler on a program called shortHMM that you might be interested in. It seems to be one of the best programs around right now (at least that's what the authors say), and the paper gives a nice little introduction to the field.