linux-ipsec: syntax conventions, take 2
From: Henry Spencer <henry%spenford(at)zoo.toronto.edu>
Date: Wed Mar 25 1998 - 13:00:14 EST
IPSEC UI syntax
Henry Spencer
v1.1, 25 March 1998

This is an attempt to set down some rules for how our interfaces look
and behave. I don't expect that everything will conform to it
immediately, although I hope we can keep the transition period short.
Special cases may call for exceptions to the rules, but there should be
a good reason. This document is not final; comments and criticisms are
welcome.

Principles

I've tried to follow a few basic principles here. Related items of data
should be grouped together where this is useful. Syntaxes using shell
metacharacters should be avoided, to simplify use of cut-and-paste
interfaces. For the same reason, input and output syntax should be
identical. Having precisely the optimal syntax in any particular context
is not nearly so important as keeping the syntax consistent so that
people don't have to learn separate rules for each context.

Basic Syntax

Numbers follow the familiar C syntax. Octal values, should we have cause
to use any, are always written with a leading zero. Decimal values never
have a leading zero. Hexadecimal values always have a leading 0x. Input
recognizes alphabetic hex digits (and the x in the 0x prefix) in either
case, but output is always in lowercase.

Attempts to guess the base of an input value are to be avoided, even
when it is clear from context (e.g., keys are almost invariably hex),
because then the user has to remember which contexts make guesses and
which don't. Insisting that the user always specify his intentions
explicitly is better.

Precisely which base is used for output depends on circumstances. Values
which are bit fields, like key data, should generally be in hex. Most
other values -- SPIs, lifetimes, port numbers, etc. -- should be decimal.
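
The number conventions above can be sketched in C as follows. This is
only an illustration, assuming the standard strtoul() with base 0, which
already applies the C rules (leading 0 for octal, leading 0x or 0X for
hex, decimal otherwise). The names parse_number and format_hex are
hypothetical and not part of any existing library.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a number written in C syntax.
 * Returns 0 on success, -1 on any syntax or range error. */
int parse_number(const char *s, unsigned long *out)
{
    char *end;

    /* strtoul() would skip leading white space and accept a sign;
     * insist that the value start with a digit instead. */
    if (!isdigit((unsigned char)s[0]))
        return -1;
    errno = 0;
    *out = strtoul(s, &end, 0);     /* base 0: 0 = octal, 0x = hex */
    if (errno != 0 || end == s || *end != '\0')
        return -1;                  /* overflow or trailing garbage */
    return 0;
}

/* Hypothetical helper: hex output, always lowercase with a 0x prefix. */
int format_hex(char *buf, size_t size, unsigned long value)
{
    return snprintf(buf, size, "0x%lx", value);
}
```

Because the base is taken only from the prefix, an input such as "08"
is rejected rather than silently reinterpreted as decimal.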
Codes (enumerations selecting, e.g., encryption algorithms) and flags
Flags fields with more than one bit set are written with the flag names
separated by plus (+). Flag order is not significant on input, and on
output it should be consistent in all places where a given set of flag
bits is being output. In the absence of any other specific preference,
consider working from right to left, because new flags tend to be added
on the left and it's best if new names appear on the end in the output.

Code/flag names should always contain either a hyphen or at least one
non-hexadecimal letter, to avoid any chance of confusion.

Attempts to allow ad-hoc shortening of names on input by guessing the
expansion are to be avoided, because when new names are added and old
short versions thus become ambiguous, the guesser has a choice between
making errors and rejecting some formerly-valid inputs (which may appear
in scripts), and neither is satisfactory. Desirable abbreviations should
be defined as separate names. Abbreviations for flag bits might include
multiple bits, to allow common combinations to be set conveniently, but
output should always use the individual bit names.

Code/flag names are case-insensitive on input. Output should always use
lowercase.

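
A minimal sketch of the flag rules above, for illustration only. The
flag names and bit values in the table (inbound, tunnel, replay-check)
are invented for this example, as are the function names; none of them
come from the actual package. Output walks from the rightmost bit
leftward, and input accepts exact names only, in any order and either
case.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag table, lowest bit first; names are lowercase. */
struct flagname {
    unsigned int bit;
    const char *name;
};
static const struct flagname flagnames[] = {
    { 0x1, "inbound" },
    { 0x2, "tunnel" },
    { 0x4, "replay-check" },
};
#define NFLAGS (sizeof(flagnames) / sizeof(flagnames[0]))

/* Does s[0..len-1] equal name, ignoring case in s? */
static int same_name(const char *s, size_t len, const char *name)
{
    size_t i;

    if (strlen(name) != len)
        return 0;
    for (i = 0; i < len; i++)
        if (tolower((unsigned char)s[i]) != name[i])
            return 0;
    return 1;
}

/* Write the set flags as names joined by '+', rightmost bit first.
 * Returns 0, or -1 if buf is too small. */
int flags_to_text(unsigned int flags, char *buf, size_t size)
{
    size_t used = 0;
    size_t i;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (i = 0; i < NFLAGS; i++) {
        int n;

        if (!(flags & flagnames[i].bit))
            continue;
        n = snprintf(buf + used, size - used, "%s%s",
                     used > 0 ? "+" : "", flagnames[i].name);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}

/* Parse '+'-separated names; no guessing at abbreviations.
 * Returns 0, or -1 on an unknown or empty name. */
int text_to_flags(const char *text, unsigned int *flags)
{
    const char *p = text;

    *flags = 0;
    for (;;) {
        size_t len = strcspn(p, "+");
        size_t i;

        for (i = 0; i < NFLAGS; i++)
            if (same_name(p, len, flagnames[i].name))
                break;
        if (i == NFLAGS)
            return -1;
        *flags |= flagnames[i].bit;
        if (p[len] == '\0')
            return 0;
        p += len + 1;
    }
}
```

With this table, the value 0x5 is written as "inbound+replay-check", and
a shortened input such as "tun" is rejected rather than expanded.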
It might be useful to have a /proc/net/ipsec/names or something like that,
so that applications can read the kernel's code/flag names and their
definitions rather than having to have an independent compiled-in list.

IPv4 Addresses Etc.

A single IPv4 address can be written on input as a DNS name, a
dotted-quad address (exactly four decimal numbers, without leading
zeros, separated by single dots), or a single hex number (with leading
0x or 0X, and no dots). Output should always be in dotted-quad format.

Parsing software should be aware that DNS names can begin with a digit,
and that there are even some existing DNS names with all-numeric
components, e.g. "1776.com". Our parsing algorithm is as follows. If it
has a leading 0x or 0X, treat as hex. If not, scan the input for a
non-digit non-dot character, and if one is found, treat as a DNS name.
Otherwise treat as a dotted quad. It is acceptable to reject inputs
which confuse this algorithm. In all cases, check the syntax to be sure
that it fits the specified choice. Note that a DNS name may have a
trailing dot. Note also that a dotted quad must have all four numbers,
because of inconsistent historical conventions on how to treat
incomplete dotted quads (most people think "128.100.72" means
128.100.72.0, but 4BSD inet_addr() thinks it means 128.100.0.72).

An IPv4 subnet is written as a special case of a tuple (see later): a
network (see below) and a mask (see below) separated by a slash (/).
Note that either part of this can be in any of the acceptable forms,
e.g. one could have a DNS name as network and a dotted quad as mask.
Because not everybody puts "pure" network addresses (host bits zero)
into their DNS entries, it's legitimate for some of the 0 bits in the
mask to be 1 in the network on input; they should be zeroed out in the
internal form.

An IPv4 network syntactically is like an address, with the addition that
it can be an incomplete dotted quad, which is treated as being
zero-padded on the right. (NOTE CAREFULLY: neither inet_addr() nor
inet_network() does this properly. Given 128.100.72, inet_addr()
interprets it as 128.100.0.72, while inet_network() interprets it as
0.128.100.72. Sigh.)

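
The address-parsing algorithm described above (hex, DNS name, or dotted
quad) might be sketched like this. The names classify_addr and
parse_quad are hypothetical; the sketch covers only the classification
step and the strict dotted-quad check for a single address, not DNS
lookup, hex conversion, or the zero-padded network form.

```c
#include <ctype.h>
#include <string.h>

enum addr_form { ADDR_HEX, ADDR_DNS, ADDR_QUAD };

/* Classify an input: leading 0x/0X means hex; otherwise any character
 * that is neither a digit nor a dot means a DNS name; otherwise a
 * dotted quad.  The syntax must still be checked afterward. */
enum addr_form classify_addr(const char *s)
{
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        return ADDR_HEX;
    if (s[strspn(s, "0123456789.")] != '\0')
        return ADDR_DNS;
    return ADDR_QUAD;
}

/* Strict dotted quad: exactly four decimal components in 0..255, no
 * leading zeros, single dots.  The result is in host byte order.
 * Returns 0 on success, -1 on a syntax error. */
int parse_quad(const char *s, unsigned long *addr)
{
    unsigned long result = 0;
    int i;

    for (i = 0; i < 4; i++) {
        unsigned long part = 0;
        int digits = 0;

        if (i > 0 && *s++ != '.')
            return -1;              /* missing component */
        if (s[0] == '0' && isdigit((unsigned char)s[1]))
            return -1;              /* leading zero */
        while (isdigit((unsigned char)*s)) {
            part = part * 10 + (unsigned long)(*s++ - '0');
            if (++digits > 3 || part > 255)
                return -1;
        }
        if (digits == 0)
            return -1;              /* empty component */
        result = (result << 8) | part;
    }
    if (*s != '\0')
        return -1;                  /* trailing junk or fifth part */
    *addr = result;
    return 0;
}
```

Here "1776.com" classifies as a DNS name, while "128.100.72" classifies
as a dotted quad and is then rejected by parse_quad for having only
three components.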
An IPv4 subnet mask syntactically is like an address, with the addition that if it is written as a single decimal number, that stands for a 32-bit value with that many high-order bits on. This notation for masks is strongly preferred; the others are supported for historical reasons. Input should reject a mask which cannot be written this way, since it is almost certainly a typo or logic error.
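
As a rough illustration of the decimal mask notation, the sketch below
converts a mask length to a 32-bit mask and checks whether a given mask
can be written as a length at all. The function names are hypothetical,
and the validity check uses the common x & (x + 1) idiom, chosen for
this sketch rather than taken from the source.

```c
#include <stdint.h>

/* Mask with the given number of high-order bits on; len must be 0..32. */
uint32_t len_to_mask(unsigned int len)
{
    /* Shifting a 32-bit value by 32 is undefined in C, so treat a
     * length of 0 separately. */
    if (len == 0)
        return 0;
    return (uint32_t)(UINT32_C(0xffffffff) << (32 - len));
}

/* A mask can be written as a length iff all its 1 bits are on the
 * left, i.e. iff ~mask has the form 2^k - 1, for which x & (x + 1)
 * is zero. */
int mask_is_valid(uint32_t mask)
{
    uint32_t x = (uint32_t)~mask;

    return (x & (uint32_t)(x + 1)) == 0;
}
```

For example, len_to_mask(16) gives 0xffff0000, and a mask such as
0xff00ff00 fails mask_is_valid and would be rejected on input.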
(Implementation note: in unsigned arithmetic, x = ~mask, ((x|(x-1))+1)&x
    masklen = 0;
    if (mask&0x00000001)
        masklen |= 0x20;
    if (mask&(0x0000ffff<<1))    /* <<1 for 1-origin numbering */
        masklen |= 0x10;
    if (mask&(0x00ff00ff<<1))
        masklen |= 0x08;
    if (mask&(0x0f0f0f0f<<1))
        masklen |= 0x04;
    if (mask&(0x33333333<<1))
        masklen |= 0x02;
    if (mask&(0x55555555<<1))
        masklen |= 0x01;

This is simpler than, and nearly as fast as, a binary search.)

Output of a subnet should use dotted-quad notation for the network and
decimal-mask-length notation for the mask. Host bits in the network
should be zeroed out (by ANDing it with the mask) if this has not been
done already internally, and only round_up(mask_length / 8) components
of the dotted quad should be shown, to keep it as compact as possible.
So, for example, 128.100.0.0/255.255.0.0 should appear as 128.100/16 on
output.

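
The subnet output rule can be sketched as below, together with a
mask-length routine built on the same cascade of tests. Two things here
are assumptions of the sketch and not statements from the source: the
cascade is applied to the lowest 1 bit of the mask, isolated first
(each test picks out the position of a single bit, so a value with
several bits set would OR in too many terms), and a zero-length mask is
shown with one component, matching the "0/0" wildcard form. The
function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Length of a valid mask (1 bits all on the left), 0..32. */
unsigned int mask_to_len(uint32_t mask)
{
    /* Isolate the lowest 1 bit; 0 stays 0. */
    uint32_t low = mask & (uint32_t)(~mask + 1u);
    unsigned int masklen = 0;

    if (low & 0x00000001u)
        masklen |= 0x20;
    if (low & (0x0000ffffu << 1))   /* <<1 for 1-origin numbering */
        masklen |= 0x10;
    if (low & (0x00ff00ffu << 1))
        masklen |= 0x08;
    if (low & (0x0f0f0f0fu << 1))
        masklen |= 0x04;
    if (low & (0x33333333u << 1))
        masklen |= 0x02;
    if (low & (0x55555555u << 1))
        masklen |= 0x01;
    return masklen;
}

/* Write a subnet in output form: network components, then '/' and the
 * mask length.  Returns 0, or -1 if buf is too small. */
int format_subnet(char *buf, size_t size, uint32_t net, uint32_t mask)
{
    unsigned int len = mask_to_len(mask);
    unsigned int ncomp = (len + 7) / 8;     /* length / 8, rounded up */
    size_t used = 0;
    unsigned int i;
    int n;

    net &= mask;                            /* zero the host bits */
    if (ncomp == 0)
        ncomp = 1;                          /* assumption: show "0/0" */
    for (i = 0; i < ncomp; i++) {
        n = snprintf(buf + used, size - used, "%s%u", i > 0 ? "." : "",
                     (unsigned int)((net >> (24 - 8 * i)) & 0xff));
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(buf + used, size - used, "/%u", len);
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return 0;
}
```

Given network 128.100.0.0 and mask 255.255.0.0, format_subnet writes
"128.100/16".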
(Probably we should have library routines for this stuff, so everything
There is no special notation for a range of IPv4 addresses. Should it be necessary to write one, a tuple (see below) containing the two addresses is probably the best way; should it be necessary to distinguish this from a subnet, context (e.g. an option) is necessary.
Port numbers, protocol numbers, etc. may be input either as numbers

Tuples

It is sometimes desirable to group individual values into syntactic
entities that can be manipulated as a whole. An obvious example is
grouping a destination address and an SPI together to identify an SA.

A tuple is formed out of values by separating them by slash (/) or
newline. (Why would you want to use newline? Read on...)

An empty value within a tuple (a / following another / or at the end of
the tuple) indicates the use of a default value, determined by context
and not necessarily zero. A tuple that is shorter than necessary
indicates that some of its values have been omitted, in a
context-dependent way; keep the rules simple, please, and beware of what
lurks in the next paragraph.

If a command argument (etc.) that would be taken as a tuple is written
with a leading "/" or "./", that means that it is a filename, and the
real tuple is the contents of the file. This is provided so that
sensitive information like keys (but also IVs, setup options, etc.) can
be read by a route which is not open to inspection by "ps" and similar
programs. Note that this is why newlines are accepted as value
delimiters within tuples, so that bulky information can be split over
multiple lines in such a file. Also, when such a file is read, empty
lines and lines starting with "#" should be discarded.

Magic Values

It's sometimes necessary to give special "magic" values in contexts
where an ordinary value would otherwise appear. Such magic values should
have names, and should be input and output by name, even if they are
internally represented with a special numeric value like 0 or -1. Names
should contain at least one character which is not a hex digit. On input
they are case-insensitive; on output they are in lowercase.

Given the special significance of "*" to the shells, "any" should be
used when a way to write a wildcard value is needed.

The IPSEC drafts use "opaque" in some places to refer to values that are
hidden by encryption. While "opaque" should be accepted in such
contexts, "hidden" is a better way of expressing the concept, and is our
preferred way of writing this. (Remember, in the long run, people using
user interfaces probably will not have read the RFCs.)

It's awkward to provide such named magic values in contexts where a DNS
name might appear, since they might be valid local names. This is
probably best handled as a special case (context is necessary to sort
out names vs. subnets vs. ranges anyway). Note, a subnet wildcard can be
written as "0/0".

Argument Parsing

Command arguments should be parsed with the GNU getopt_long(), to permit
standard option syntax and (sigh) long option names. Positional
arguments after the options are okay, but keep the number down, e.g. by
grouping them into tuples.

All commands, without exception, must implement "--help" and
"--version". When either of these two options is encountered in argument
parsing, the command should immediately output the requested information
on standard output, abandon all further argument parsing, and exit with
a "successful" exit status.

"--help" should output brief documentation on how the program is
invoked. A good guideline is that it should be *at most* 20 lines and
preferably no more than 10.

"--version" should output brief information on the program's name and
version number (which, for minor utilities tied to our IPSEC package,
should match the package's version number). The first line, in
particular, should be of the form "ppppp (Linux FreeS/WAN) vvvvv" where
ppppp is the program's name (not its argv[0], which may vary for
irrelevant reasons, but its official name) and vvvvv is the version
number. The first line is probably all that's really necessary, but if
you want to include more, see the GNU programming standards (found in
pub/gnu/standards/standards.text on the usual GNU archive machines) for
ideas.

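
A minimal sketch of the "--help" and "--version" handling with GNU
getopt_long(). The program name "example", the usage text, the
handle_options function, and its return codes are all hypothetical;
vvvvv is left as the version placeholder, as in the first-line format
above. The sketch returns a code and leaves the actual exit to the
caller.

```c
#include <getopt.h>
#include <stdio.h>

/* First line of --version output; vvvvv is a placeholder. */
static const char version_format[] = "%s (Linux FreeS/WAN) vvvvv";

enum { OPT_CONTINUE, OPT_EXIT_OK, OPT_ERROR };

/* Scan the options of a command officially named "example".
 * Returns OPT_EXIT_OK after --help or --version, in which case the
 * caller should exit at once with a successful status; OPT_ERROR on a
 * bad option; OPT_CONTINUE otherwise, with optind at the first
 * positional argument. */
int handle_options(int argc, char **argv)
{
    static const struct option longopts[] = {
        { "help",    no_argument, NULL, 'h' },
        { "version", no_argument, NULL, 'v' },
        { NULL, 0, NULL, 0 }
    };
    int c;

    optind = 1;                     /* start scanning at argv[1] */
    while ((c = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
        switch (c) {
        case 'h':
            printf("Usage: example [--help] [--version] tuple...\n");
            return OPT_EXIT_OK;     /* abandon further parsing */
        case 'v':
            printf(version_format, "example");  /* official name */
            printf("\n");
            return OPT_EXIT_OK;
        default:
            return OPT_ERROR;       /* getopt_long has complained */
        }
    }
    return OPT_CONTINUE;
}
```

A main() would call handle_options(argc, argv) first, exit with status 0
on OPT_EXIT_OK, and treat argv[optind] onward as positional tuples
otherwise.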
(Implementation note: the release engineer will arrange for any relevant
    static char ipsec_version[] = "%s (Linux FreeS/WAN) vvvvv";

where vvvvv will be filled in suitably and updated with each release.
The intent, obviously, is that this be used as a printf string with the
program name supplied as an argument. It's unfortunate that this file
can't just be in a single central place, but I fear it's probably
necessary to put it into the individual directories, given that part of
our source tree gets relocated into the kernel sources.)

SA Identifiers

An IPSEC SA is identified by a tuple containing the destination address
Technically there are two separate SPI number spaces, one for AH and one
for ESP. Our current implementation has a single unified space, which is
fine for incoming SAs (where we are the destination) since we allocate
SPIs for them, but will need caution for outbound SAs. There are not
many situations where we need to look up an outbound SA by identifier,
and it's probably simplest to keep the AH/ESP distinction a separate
parameter, but that's a guess and this decision may need to be revisited

Note that our current SA table actually is a muddled combination of the
SAD and SPD. I think we're going to have to separate the two eventually.
That's why the above definition makes no mention of subnets and ranges;
an IPSEC SA is identified by a full, single IP destination address, not
by a pattern for same. Patterns appear only in the SPD.

Examples

I haven't finished sorting out the application of this to our current
commands, but here's a quick first cut. This assumes unification into a
single command with subcommands, which I think is a good idea, to
minimize the chance of name collisions.

Currently we have:

    fir# addrt 10.2.0.143 255.255.255.255 10.2.0.139 255.255.255.255 \
        10.2.0.139 125

and that might become:

    fir# ipsec addtunnel 10.2.0.143/32 10.2.0.139 --spi 125

(That is, tunnel a specified subnet to a given destination, with an

    fir# setsa 10.2.0.139 125 esp 3des-md5-96 i \
        1000000000000001 6630663066303132

might become:

    fir# ipsec addsa esp 10.2.0.139/125 3des-md5-96 \
        1000000000000001/6630663066303132/i

(where all information needed to set up the algorithm is collected into
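
Tuple arguments like those in the examples above could be split into
their values with a routine along these lines. This is a sketch only:
split_tuple is a hypothetical name, and the sketch handles just the
splitting on slash or newline and the empty-value case, not the
filename form or the discarding of "#" lines when a tuple is read from
a file.

```c
#include <string.h>

/* Split a tuple in place at each '/' or newline.  Stores pointers to
 * the values in vals and returns how many there are, or -1 if there
 * are more than max.  An empty string in vals marks an empty value,
 * for which the caller supplies the default its context defines. */
int split_tuple(char *text, char **vals, int max)
{
    char *p = text;
    int n = 0;

    for (;;) {
        size_t len = strcspn(p, "/\n");
        int last = (p[len] == '\0');

        if (n >= max)
            return -1;
        vals[n++] = p;
        p[len] = '\0';              /* terminate this value */
        if (last)
            return n;
        p += len + 1;               /* step past the delimiter */
    }
}
```

Applied to "1000000000000001/6630663066303132/i" this yields three
values; applied to "10.2.0.139/" it yields the address followed by one
empty value.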
Received on Wed Mar 25 14:26:44 1998