Serial Communication Using RS-232 by Christian Blum
13 Greenacre
Windsor, SL4 5LW
United Kingdom
http://barleywood.com
This is a summary of serial communication using the TTY protocol. It contains information on the TTY protocol and on hardware and software implementations for IBM PCs, which has been derived from National Semiconductor data sheets and the practical experience of the author and his supporters.
If you want to contribute to this file in any way, please email me. My email address is: chbl@stud.uni-sb.de or chris@phil15.uni-sb.de
Acknowledgements
The following persons have contributed to this summary by providing information or making suggestions/reporting errors:
Madis Kaal <mast@anubis.kbfi.ee>
Steve Poulsen <stevep@ims.com>
Scott C. Sadow <NS16550A@mycro.UUCP>
Dan Norstedt <?>
Alan J. Brumbaugh <brumba@maize.rtsg.mot.com>
Mike Surikov <surikov@adonis.iasnet.com>
Varol Kaptan <E66964%trmetu.bitnet@relay.EU.net>
Richard F. Drushel <rfd@po.CWRU.Edu>
John A. Limpert <johnl@n3dmc.svr.md.us>
Brent Beach <ub359@freenet.victoria.bc.ca>
Torbjoern (sp?) Lindgren <tl@etek.chalmers.se>
Stephen Warner <ee_d316@dcs.kingston.ac.uk>
Kristian Koehntopp <kris@black.toppoint.de>
Angelo Haritsis <ah@doc.ic.ac.uk>
Jim Graham <jim@n5ial.mythical.com>
Ralf Brown <ralf@cs.cmu.edu>
Alfred Arnold <zam036@zam112.zam.kfa-juelich.de>
Andrew M. Langmead <aml@world.std.com>
Richard Clayton <richard@locomotive.com>
Christof Baumgaertner <baumg@rhrk.uni-kl.de>
Goran Bostrom <GORAN@infovox.se>
Brian Mork <bmork@opus-ovh.spk.wa.us>
Introduction
One of the most universal parts of the PC is its serial port. You can connect a mouse, a modem, a printer, a plotter, another PC, dongles, etc. But its usage (both software and hardware) is one of the best-kept secrets for most users, even though it is not difficult to understand how to connect devices to it and how to program it. This document serves as a manual for the serial port of your PC, covering both hardware and software.
Historical summary
In the early days of telecommunication, errand-boys and optical signals (flags, lights, clouds of smoke) were the only methods of transmitting information across long distances. With increasing requirements on speed and a growing amount of information, more practical methods were developed. One milestone was the first wire-bound transmission on May 24th, 1844 ("What hath God wrought", using the famous Morse alphabet). Well, technology improved a bit, and soon there were machines that could be used like typewriters, except that you typed not only on your own sheet of paper but also on somebody else's. The only thing that has changed on the step from the teletype to your PC regarding serial communications is speed.
The TTY (teletyping) protocol
A protocol is a clear description of the logical method of transmitting information. This does not include the physical realization.
There is a difference between bits per second and baud (named after J. M. E. Baudot, one of those guys who gave a real push to teletyping): baud means "state changes of the line per second" while bits per second ... well, bits per second means bits per second.
You may find this a bit weird; there's only a difference if the line has more than two states. Since this is not the case with the RS232C (EIA232) port of your PC, most people don't differentiate between baud and bits per second, while I do. For your convenience, I've replaced baud with bps even in copied material without special note. Where you still find baud, it should read bps in most cases (I didn't change labels in source codes, pin names in data sheet information etc.).
To illustrate the difference I give you some figures: 2400 bps at 8n1 carry 1920 bits of information per second, and modems send them at 600 baud through the phone wires, while 1200 bps at 7e1 carry 840 bits of information per second that modems send at 600 baud. I know it's confusing... that's why I quote this from a letter I received from Brent Beach. He explained it more clearly than I did (I've added some information):
"Perhaps a small diagram might help, showing the relationship among the players:
CPU data bus --(1)--> serial port --(2)--> modem --(3)--> phone line
The serial port accepts bytes from the CPU data bus and passes bits to the modem. In doing this, the serial port can add or delete bits, depending on the coding scheme in use.
At (1) we are concerned with bytes per second. At (2) we are concerned with bits per second, and at (3) it's baud. We distinguish because the number of bits at (2) need not be equal to the number of bits (that is, bytes times 8) at (1), and the number of state changes at (3) is not necessarily the same as the number of bits before.
Bits can be stripped going from (1) to (2): the serial port may transmit only 6 or 7 of the 8 bits in the byte. Bits can be added going from (1) to (2): the serial port can add a parity bit and stop bits. From (2) to (3), bits may be clustered to groups that are transmitted using different encoding schemes like Frequency Shift Keying or Quadrature Amplitude Modulation, to name just two.
You can determine the transfer rate in bytes per second depending on the serial port speed and the coding system. For example,
8n1: 1 start bit + 8 data bits + 1 stop bit per byte = 10 bits per byte.
At 2400 bps, this is 240 bytes/characters per second. 2400 bps are normally transmitted using QAM where 4 bits are clustered, and hence encoded to 600 baud.
7e1: 1 start bit, 7 data bits, 1 parity bit, 1 stop bit = 10 bits per byte.
At 1200 bps, this is 120 bytes/characters per second. 1200 bps are encoded using DPSK ('Differential Phase Shift Keying', two bits are clustered), and this results again in 600 baud."
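The arithmetic in this letter can be checked with a few lines of Python. This is only a sketch; the function names and the `bits_per_symbol` parameter are labels made up for this illustration, not part of any standard API.

```python
def bits_per_frame(data_bits, parity, stop_bits):
    """Bits on the line per character: start + data + optional parity + stop."""
    parity_bits = 0 if parity == "n" else 1
    return 1 + data_bits + parity_bits + stop_bits


def chars_per_second(bps, data_bits, parity, stop_bits):
    """Characters per second, derived from the bit rate at (2)."""
    return bps / bits_per_frame(data_bits, parity, stop_bits)


def info_bits_per_second(bps, data_bits, parity, stop_bits):
    """Payload bits per second, counting the data bits only."""
    return chars_per_second(bps, data_bits, parity, stop_bits) * data_bits


def baud(bps, bits_per_symbol):
    """State changes per second at (3) when several bits share one line state."""
    return bps / bits_per_symbol


print(chars_per_second(2400, 8, "n", 1))      # 240.0
print(info_bits_per_second(2400, 8, "n", 1))  # 1920.0
print(baud(2400, 4))                          # 600.0
print(chars_per_second(1200, 7, "e", 1))      # 120.0
print(info_bits_per_second(1200, 7, "e", 1))  # 840.0
print(baud(1200, 2))                          # 600.0
```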
Now let's leave modems for a while and have a look at the serial port itself.
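The TTY protocol uses two different line states called MARK and SPACE. (For the sake of clarity I name the line states HIGH for positive and LOW for negative voltages.)
If no data is transmitted, the line is in its quiescent LOW (MARK) state or in the BREAK state (HIGH). A data word consists of a start bit, the data bits (least significant bit first), an optional parity bit and the stop step; each of these is described below.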
Both transmitter (TX) and receiver (RX) use the same data rate (measured in bps, see above), which is the reciprocal value of the smallest time interval between two changes of the line state. TX and RX know about the number of data bits (probably with a parity bit added), and both know about the size of the stop step (called the stop bit or the stop bits, depending on the size of the stop step; normally 1, 1.5 or 2 times the size of a data bit). Data is transmitted bit-synchronously and word-asynchronously, which means that the size of the bits, the length of the words etc., is clearly defined while the time between two words is undefined.
The start bit indicates the beginning of a new data word. It is used to synchronize transmitter and receiver and is always a logical 0 (so the line goes HIGH).
Data is transmitted LSB to MSB, which means that the least significant bit (LSB, Bit 0) is transmitted first with 4 to 7 bits of data following, resulting in 5 to 8 bits of data. A logical 0 is transmitted by the HIGH state of the line, a logical 1 by LOW.
A parity bit can be added to the data bits to allow error detection. There are two (well, actually five) kinds of parity: odd and even (plus none, mark and space). Odd parity means that the number of LOW steps in the data word (including the parity bit) is always odd, so the parity bit is set accordingly (I don't have to explain even parity, do I?). It is also possible to set the parity bit to a fixed state or to omit it.
The stop bit does not indicate the end of the word (as it could be derived from its name); it rather separates two consecutive words by putting the line into the LOW state for a minimum time (that means the stop bit is a logical 1).
The protocol is usually described by a sequence of numbers and letters, eg. 8n1 means 1 start bit (always), 8 bits of data, no parity bit, 1 stop bit. 7e2 would indicate 7 bits of data, even parity, 2 stop bits (but I've never seen this one...). The usual thing is 8n1 or 7e1.
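To make the framing rules concrete, here is a minimal Python sketch that builds the bit sequence of one data word from the description above. The function names and the one-letter parity codes ("n", "e", "o", "m", "s") are my own choices for this illustration, and only whole stop bits are modelled (no 1.5).

```python
def build_frame(value, data_bits=8, parity="n", stop_bits=1):
    """Return the logical bits of one data word in transmission order."""
    # Data bits go out least significant bit first.
    data = [(value >> i) & 1 for i in range(data_bits)]
    frame = [0] + data                 # the start bit is always a logical 0
    ones = sum(data)
    if parity == "e":                  # even: total number of 1s is even
        frame.append(ones % 2)
    elif parity == "o":                # odd: total number of 1s is odd
        frame.append(1 - ones % 2)
    elif parity == "m":                # mark: parity bit fixed at logical 1
        frame.append(1)
    elif parity == "s":                # space: parity bit fixed at logical 0
        frame.append(0)
    frame += [1] * stop_bits           # stop bits are logical 1
    return frame


def line_states(frame):
    """Map logical bits to line states: 0 is HIGH (SPACE), 1 is LOW (MARK)."""
    return ["HIGH" if bit == 0 else "LOW" for bit in frame]


print(build_frame(0x41, 8, "n", 1))  # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(build_frame(0x41, 7, "e", 1))  # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(build_frame(0x41, 7, "o", 1))  # [0, 1, 0, 0, 0, 0, 0, 1, 1, 1]
print(line_states(build_frame(0x41))[:2])  # ['HIGH', 'LOW']
```

Note that the letter "A" (0x41) happens to produce the same ten bits at 8n1 and 7e1, because its even-parity bit is 0.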
Your PC is capable of serial transmission at up to 115,200 bps (step size of 8.68 microseconds!). Typical rates are 300 bps, 1200 bps, 2400 bps and 9600 bps.
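The step size is just the reciprocal of the data rate. A short Python sketch (helper names are hypothetical) that reproduces the 8.68 microseconds and also gives the time per character, assuming 10 bits per character as with 8n1 or 7e1:

```python
def bit_time_us(bps):
    """Duration of one bit (one step) in microseconds."""
    return 1_000_000 / bps


def char_time_us(bps, bits_per_char=10):
    """Time to send one character; 10 bits per character matches 8n1 or 7e1."""
    return bits_per_char * bit_time_us(bps)


for rate in (300, 1200, 2400, 9600, 115200):
    print(rate, round(bit_time_us(rate), 2), round(char_time_us(rate), 1))
```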
This is what John A. Limpert told me about teletypes.
"Real (mechanical) teletypes used 1 start bit, 5 data bits and 1.42 stop bits. Support for 1.5 stop bits in UARTs was a compromise to make the UART timing simpler. Normal speeds were 60 WPM (words per minute), 66 WPM, 75 WPM and 100 WPM. A word was defined as 6.1 characters. The odd stop bit size was a result of the mechanical nature of the machine. It was the time that the printer needed to finish the current character and get ready for the next character. Most teletypes used a 60 mA loop with a 130 V battery. 20 mA loops and lower battery voltages became common when 8 level ASCII teletypes were introduced. The typical ASCII teletype ran at 110 bps with 2 stop bits (11 bits per character)."
It's surely more exact than what I wrote in previous releases. I've just got to add that at least in Germany 50 bps was a familiar speed. And I think the lower voltage he's talking about was 24 volts.
The Physical Transmission
Teletypes use a closed-loop line with a quiescent current of 20 mA and a space current of 0 mA (typically), which makes it possible to detect a broken line. The RS232C port of your PC uses voltages rather than currents to indicate logical states: MARK/LOW is signaled by -3 V to -15 V (typically -12 V) and represents a logical 1. SPACE/HIGH is signaled by +3 V to +15 V (typically +12 V) and represents a logical 0. The typical output impedance of the serial port of a PC is 2 kΩ (resulting in about 5 mA @ 10 V), the typical input impedance is about 4.3 kΩ, so there should be a maximum fan-out of 5 (5 inputs can be connected to 1 output). Please don't rely on this; it may differ from PC to PC.
At least three lines (RX, TX and GND) are needed.
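The voltage ranges can be written down as a small Python sketch. The function names are invented for this illustration, and the "undefined" result for voltages outside the two ranges is my own labelling of what the ranges above leave open.

```python
def logical_state(volts):
    """Classify an RS232C line voltage; returns (line state, logical value)."""
    if -15.0 <= volts <= -3.0:
        return ("MARK/LOW", 1)
    if 3.0 <= volts <= 15.0:
        return ("SPACE/HIGH", 0)
    return ("undefined", None)   # between -3 V and +3 V, or beyond 15 V


def output_current_ma(volts, output_impedance_kohm=2.0):
    """Current through the output impedance by Ohm's law (V / kOhm = mA)."""
    return volts / output_impedance_kohm


print(logical_state(-12.0))     # ('MARK/LOW', 1)
print(logical_state(12.0))      # ('SPACE/HIGH', 0)
print(logical_state(1.5))       # ('undefined', None)
print(output_current_ma(10.0))  # 5.0
```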
Q. Why does my PC have a 25pin/9pin connector if there are only 3 lines needed?
A. There are several status lines that are only used with modems etc. See the Hardware section.
Q. How can I easily connect two PCs by a three-wire lead?
A. This connection is called a null-modem connection. RX1 is connected to TX2 and vice versa, GND1 to GND2. In addition to this, connect RTS to CTS and DCD and connect DTR to DSR (modem software often relies on that). See the Hardware section for further details.
Please be aware that at 115,200 bps (i.e. c.115 kHz, but we need the harmonics up to at least 806 kHz) lines can no longer be regarded as ideal transmission lines. They are low-pass filters and tend to reflect and mutilate the signals, but some metres of twisted wire should always be OK (I use 3m of screened audio cable for file transfer purposes, and it works fine. Not that other kinds of wire wouldn't do; I took what I found). See a good book on transmission lines if you're interested in why long lines can be a problem.
The following was posted to comp.os.msdos.programmer by Andrew M. Langmead:
"The RS-232 spec. has an official limit of 50 ft for RS-232 cables. Realistically they can be much longer. The book Managing UUCP and Usenet by O'Reilly and Associates has a table that they credit to Technical Aspects of Data Communications by McNamara (Digital Press, 1992). It lists the maximum distances for an RS-232 connection.
Baud Rate | Max Distance, Shielded Cable | Max Distance, Unshielded Cable
110 | 5000 ft | 3000 ft
300 | 5000 ft | 3000 ft
1200 | 3000 ft | 3000 ft
2400 | 1000 ft | 500 ft
4800 | 1000 ft | 250 ft
9600 | 250 ft | 250 ft
(1 ft ≈ 30 cm)"
Please note that baud is correct in this case, because we're speaking of the transmission line itself.
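If you want to use the table in a program, a simple lookup will do. The following Python sketch holds the quoted figures in a dictionary; rounding an unlisted rate up to the next listed one is a cautious assumption of mine, not something the table states, and the function names are hypothetical.

```python
# Maximum cable lengths in feet from the quoted table:
# rate -> (shielded, unshielded)
MAX_DISTANCE_FT = {
    110: (5000, 3000),
    300: (5000, 3000),
    1200: (3000, 3000),
    2400: (1000, 500),
    4800: (1000, 250),
    9600: (250, 250),
}


def max_distance_ft(rate, shielded=True):
    """Use the row of the slowest listed rate that is >= the requested rate."""
    for listed in sorted(MAX_DISTANCE_FT):
        if rate <= listed:
            shielded_ft, unshielded_ft = MAX_DISTANCE_FT[listed]
            return shielded_ft if shielded else unshielded_ft
    return None  # faster than anything in the table


def feet_to_metres(feet):
    """Rough conversion with 1 ft taken as 30 cm."""
    return feet * 30 / 100


print(max_distance_ft(2400, shielded=False))  # 500
print(max_distance_ft(600))                   # 3000 (falls into the 1200 row)
print(max_distance_ft(115200))                # None
print(feet_to_metres(max_distance_ft(9600)))  # 75.0
```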
This is what Torbjoern Lindgren told me:
"I have successfully transmitted at 115,200 with over 30 m long cables! And it wasn't especially good wires. I had some old telecables with 20 individual wires, and used 7 of them for transfer, and left the others unconnected.
I don't remember the exact length, but I know it was something over 30 m, and it probably was closer to 40 m than 30 m. The unused lines probably shielded the lines from each other or something like that. The computers used were two PC-compatibles with off-the-shelf COM-ports. Nothing fancy."
Note that some serial ports are more sensitive to mutilated signals than others, so you just have to try it and find out for yourself.
Hardware
Connectors
PCs have 9pin/25pin male SUB-D connectors. The pin layout is as follows (seen from outside your PC):
Name (V24) | 25 pin | 9 pin | Direction | Full name | Remarks
TxD | 2 | 3 | Output | Transmit Data |
RxD | 3 | 2 | Input | Receive Data |
RTS | 4 | 7 | Output | Request To Send |
CTS | 5 | 8 | Input | Clear To Send |
DTR | 20 | 4 | Output | Data Terminal Ready |
DSR | 6 | 6 | Input | Data Set Ready |
RI | 22 | 9 | Input | Ring Indicator |
DCD | 8 | 1 | Input | Data Carrier Detect |
GND | 7 | 5 | - | Signal ground |
- | 1 | - | - | Protective ground | Don't use this one for signal ground!
The most important lines are RxD, TxD, and GND. Others are used with modems, printers and plotters to indicate internal states.
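The pin table translates directly into a lookup structure, for example to work out the wiring of a 25-pin to 9-pin adapter. This is a Python sketch with names of my own choosing; the pin numbers are the ones from the table.

```python
# signal name -> (25-pin number, 9-pin number, direction as seen from the PC)
PINS = {
    "TxD": (2, 3, "out"),
    "RxD": (3, 2, "in"),
    "RTS": (4, 7, "out"),
    "CTS": (5, 8, "in"),
    "DTR": (20, 4, "out"),
    "DSR": (6, 6, "in"),
    "RI": (22, 9, "in"),
    "DCD": (8, 1, "in"),
    "GND": (7, 5, None),
}


def adapter_25_to_9():
    """Which 9-pin contact each 25-pin contact has to be wired to."""
    return {pin25: pin9 for pin25, pin9, _ in PINS.values()}


def pc_outputs():
    """Names of the lines that the PC drives."""
    return [name for name, (_, _, direction) in PINS.items() if direction == "out"]


print(adapter_25_to_9()[20])  # 4 (DTR)
print(pc_outputs())           # ['TxD', 'RTS', 'DTR']
```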
1 (MARK, LOW) means -3 V to -15 V; 0 (SPACE, HIGH) means +3 V to +15 V. On status lines, HIGH is the active state: status lines go to the HIGH state to signal events.
The lines are:
RxD, TxD: These lines carry the data; 1 is transmitted as LOW and 0 is transmitted as HIGH.
RTS, CTS: Are used by the PC and the modem/printer/whatever (further on referred to as the data set) to start/stop a communication. The PC sets RTS to HIGH, and the data set responds with CTS HIGH (always in this order). If the data set wants to stop/interrupt the communication (eg. buffer overflow), it drops CTS to LOW; the PC uses RTS to control the data flow.
DTR, DSR: Are used to establish a connection at the very beginning, i.e. the PC and the data set "shake hands" first to ensure they are both present and active. The PC sets DTR to HIGH, and the data set answers with DSR HIGH. Modems often indicate hang-up by resetting DSR to LOW.
These six lines plus GND are often referred to as the "7-wire connection" or "hand-shake connection."
DCD: The modem uses this line to indicate that it has detected the carrier of the modem on the other side of the phone line.
RI: The modem uses this line to signal that "the phone rings" (even if there is neither a bell fitted to your modem nor a phone connected).
GND: The signal ground, ie. the reference level for all signals.
Protective ground: This line is connected to the power ground of the serial adapter. It should not be used as a signal ground, and it must not be connected to GND (even if your multi-meter shows an ohmic connection!). Connect this line to the screen of the lead (if there is one). Connecting protective ground on both sides makes sure that no large currents flow through GND in case of an insulation defect on one side (hence the name).
Technical data (typical):
Signal level | -10.5 V/+11 V
Short circuit current | 6.8 mA
Output impedance | c. 2 kΩ (non-linear!)
Input impedance | c. 4.3 kΩ (non-linear!)
Other Asynchronous Hardware
There are several other standards that use the same chipset and protocol as RS232. RS422 and the more robust (but compatible) version RS485 (to name just two) use two wires for every signal. The transmitters can usually be disabled and enabled by software, which makes it possible to use such equipment in a bus system (RX and TX part share the same lines). Despite the possibility to enable and disable the receiver/transmitter section of the port, they are fully compatible with existing RS232 software if a compatible chipset is used.
It's not possible to physically connect devices of different standards (eg. RS232 to RS485) without an appropriate interface.
When you connect a data set (eg. a modem), use this connection:
GND1 to GND2
RxD1 to RxD2
TxD1 to TxD2
DTR1 to DTR2
DSR1 to DSR2
RTS1 to RTS2
CTS1 to CTS2
RI1 to RI2
DCD1 to DCD2
In other words, simply connect each pin of the first plug with the corresponding pin of the other. This can easily be done using a 25-wire ribbon cable and two crimp connectors.
When you connect another computer, this is the wiring you need:
GND1 to GND2
RxD1 to TxD2
TxD1 to RxD2
DTR1 to DSR2
DSR1 to DTR2
RTS1 to CTS2
CTS1 to RTS2
If software wants it, connect DCD1 to CTS1 and DCD2 to CTS2.
If hardware handshaking is not needed, a so-called null-modem connection can be used. Connect:
GND1 to GND2
RxD1 to TxD2
TxD1 to RxD2
Additionally, connect (if software needs it):
DTR1 to DSR1
DTR2 to DSR2
RTS1 to CTS1 and DCD1
RTS2 to CTS2 and DCD2
You won't need long wires for these!
The null-modem connection is used to establish an XON/XOFF-connection between two PCs (see the Handshaking section for details about XON/XOFF).
Remember: the names DTR, DSR, CTS and RTS refer to the lines as seen from the PC. This means that for your data set DTR and RTS are incoming signals and DSR and CTS are outputs! Modems, printers, plotters etc. are connected 1:1, ie. pin x to pin x.
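The wiring lists above are easy to check mechanically. The Python sketch below writes them down as pairs of signal names (all names as seen from a PC) and reports the wires that would tie two PC outputs together. It shows why the 1:1 cable is meant for data sets and not for a second PC. The variable and function names are my own.

```python
OUTPUTS = {"TxD", "DTR", "RTS"}   # lines driven by a PC

# 1:1 cable for a data set (modem, printer, plotter)
STRAIGHT = [(s, s) for s in
            ("GND", "RxD", "TxD", "DTR", "DSR", "RTS", "CTS", "RI", "DCD")]

# PC to PC with hardware handshaking
CROSSED = [("GND", "GND"), ("RxD", "TxD"), ("TxD", "RxD"),
           ("DTR", "DSR"), ("DSR", "DTR"), ("RTS", "CTS"), ("CTS", "RTS")]

# three-wire null-modem connection
NULL_MODEM = [("GND", "GND"), ("RxD", "TxD"), ("TxD", "RxD")]


def clashes_between_two_pcs(wires):
    """Wires that would connect an output of one PC to an output of the other."""
    return [(a, b) for a, b in wires if a in OUTPUTS and b in OUTPUTS]


print(clashes_between_two_pcs(STRAIGHT))
# [('TxD', 'TxD'), ('DTR', 'DTR'), ('RTS', 'RTS')]
print(clashes_between_two_pcs(CROSSED))     # []
print(clashes_between_two_pcs(NULL_MODEM))  # []
```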
Handshaking
The method of exchanging signals for data flow control between computers and data sets is called handshaking. The most popular and most often used handshaking variant is called XON/XOFF; it's done by software, while other methods are hardware-based.
XON/XOFF
Two bytes that are not mapped to normal characters in the ASCII charset are called XON (DC1, Ctrl-Q, ASCII 17) and XOFF (DC3, Ctrl-S, ASCII 19). Whenever either one of the sides wants to interrupt the data flow from the other (eg. full buffers), it sends an XOFF ("Transmission Off"). When its buffers have been purged again, it sends an XON ("Transmission On") to signal that data can be sent again. With some implementations, any character can serve as XON.
XON/XOFF is of course limited to text transmission. It cannot be used with binary data since binary files tend to contain every single one of the 256 characters. That's why hardware handshaking is normally used with modems, while XON/XOFF is often used with printers and plotters and terminals.
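A toy model in Python may help to see how XON/XOFF works from the receiving side. The `Receiver` class, its high and low water marks and the helper `safe_for_xon_xoff` are assumptions made for this sketch; real drivers differ in detail.

```python
XON = 0x11    # DC1, Ctrl-Q, ASCII 17
XOFF = 0x13   # DC3, Ctrl-S, ASCII 19


class Receiver:
    """Hypothetical receiver that asks the sender to pause when nearly full."""

    def __init__(self, high_water=6, low_water=2):
        self.buffer = []
        self.high_water = high_water
        self.low_water = low_water
        self.paused = False          # True after XOFF has been sent

    def receive(self, byte):
        """Store one byte; return XOFF if the sender should stop, else None."""
        self.buffer.append(byte)
        if not self.paused and len(self.buffer) >= self.high_water:
            self.paused = True
            return XOFF
        return None

    def consume(self, count):
        """Process bytes; return XON once the buffer has drained, else None."""
        del self.buffer[:count]
        if self.paused and len(self.buffer) <= self.low_water:
            self.paused = False
            return XON
        return None


def safe_for_xon_xoff(payload):
    """Data containing 17 or 19 would be mistaken for flow control bytes."""
    return XON not in payload and XOFF not in payload


rx = Receiver()
replies = [rx.receive(b) for b in b"HELLO!"]
print(replies[-1] == XOFF)                   # True: sixth byte hits the high mark
print(rx.consume(4) == XON)                  # True: two bytes left, sender may go on
print(safe_for_xon_xoff(b"plain text"))      # True
print(safe_for_xon_xoff(bytes(range(256))))  # False
```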
DTR/DSR
The Data Terminal Ready and Data Set Ready signals of the serial port can be used for handshaking purposes, too. Their names express what they do: the computer signals with DTR that it's ready to send and receive data, while the data set sets DSR. With most modems, the meaning of these signals is slightly different: DTR is ignored or causes the modem to hang up if it is dropped, while DSR signals that a connection has been established.
RTS/CTS
While DTR and DSR are mostly used to establish a connection, RTS and CTS have been specially designed for data flow control. The computer signals with RTS (Request To Send) that it wishes to send data to the data set, while the data set (modem) sets CTS (Clear To Send) when it's ready to do one part of its job: to send data through the phone wires.
A normal handshaking protocol between a computer and a modem looks like this:
1 | The computer sets DTR to indicate that it wants to make use of the modem.
2 | The modem signals that it is ready and that a connection has been established.
3 | The computer requests permission to send.
4 | The modem informs the computer that it is now ready to receive data from the computer and send it through the phone wires.
5 | The modem drops CTS to signal to the computer that its internal buffers are full; the computer stops sending characters to the modem.
6 | The buffers of the modem have been purged, so the computer may continue to send data.
7 | This situation is not clear; either the computer's buffers are full and it wants to inform the modem of this, or it doesn't have any more data to be sent to the modem. Normally, modems are configured to stop any transmission between the computer and the modem when RTS is dropped.
8 | The modem acknowledges RTS by dropping CTS.
9 | RTS is again raised by the computer to re-establish data transmission.
10 | The modem shows that it is ready to do its job.
11 | No more data is to be sent.
12 | The modem acknowledges this.
13 | DTR is dropped by the computer; this causes most modems to hang up. After hang-up, the modem acknowledges with DSR low. If the connection breaks, the modem also drops DSR to inform the computer about it.
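The sequence of steps can be played through with a small simulation. The `Modem` class below is a toy model in Python, not a driver: the buffer capacity, the method names and the simplification that DSR simply follows DTR (a real modem raises DSR only once a connection exists) are all assumptions made for this sketch.

```python
class Modem:
    """Toy model of the data set; all names and sizes are illustrative."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.buffer = []
        self.dtr = self.rts = False   # inputs, driven by the computer
        self.dsr = self.cts = False   # outputs, read by the computer

    def update(self):
        """Recompute the output lines from the inputs and the buffer level."""
        self.dsr = self.dtr                                  # steps 1, 2 and 13
        self.cts = (self.dsr and self.rts                    # steps 3, 4, 7 to 12
                    and len(self.buffer) < self.capacity)    # steps 5 and 6

    def set_lines(self, dtr=None, rts=None):
        """The computer changes DTR and/or RTS."""
        if dtr is not None:
            self.dtr = dtr
        if rts is not None:
            self.rts = rts
        self.update()

    def write(self, byte):
        """Accept one byte from the computer only while CTS is set."""
        if not self.cts:
            return False
        self.buffer.append(byte)
        self.update()
        return True

    def drain(self):
        """Pretend the buffered data went out through the phone wires."""
        self.buffer.clear()
        self.update()


modem = Modem(capacity=4)
modem.set_lines(dtr=True)     # step 1; the modem answers with DSR (step 2)
modem.set_lines(rts=True)     # step 3; the modem answers with CTS (step 4)
sent = [modem.write(b) for b in b"ABCDEF"]
print(sent)                   # [True, True, True, True, False, False] (step 5)
modem.drain()                 # step 6: CTS is raised again
print(modem.cts)              # True
modem.set_lines(rts=False)    # steps 7 and 8
print(modem.cts)              # False
modem.set_lines(dtr=False)    # step 13
print(modem.dsr)              # False
```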