PHP Coding Standards ==================== This file lists several standards that any programmer, adding or changing code in PHP, should follow. Since this file was added at a very late stage of the development of PHP v3.0, the code base does not (yet) fully follow it, but it's going in that general direction. Since we are now well into the version 4 releases, many sections have been recoded to use these rules. Code Implementation ------------------- [0] Document your code in source files and the manual. [tm] [1] Functions that are given pointers to resources should not free them For instance, function int mail(char *to, char *from) should NOT free to and/or from. Exceptions: - The function's designated behavior is freeing that resource. E.g. efree() - The function is given a boolean argument, that controls whether or not the function may free its arguments (if true - the function must free its arguments, if false - it must not) - Low-level parser routines, that are tightly integrated with the token cache and the bison code for minimum memory copying overhead. [2] Functions that are tightly integrated with other functions within the same module, and rely on each other non-trivial behavior, should be documented as such and declared 'static'. They should be avoided if possible. [3] Use definitions and macros whenever possible, so that constants have meaningful names and can be easily manipulated. The only exceptions to this rule are 0 and 1, when used as false and true (respectively). Any other use of a numeric constant to specify different behavior or actions should be done through a #define. [4] When writing functions that deal with strings, be sure to remember that PHP holds the length property of each string, and that it shouldn't be calculated with strlen(). Write your functions in a such a way so that they'll take advantage of the length property, both for efficiency and in order for them to be binary-safe. Functions that change strings and obtain their new lengths while doing so, should return that new length, so it doesn't have to be recalculated with strlen() (e.g. php_addslashes()) [5] NEVER USE strncat(). If you're absolutely sure you know what you're doing, check its man page again, and only then, consider using it, and even then, try avoiding it. [6] Use PHP_* macros in the PHP source, and ZEND_* macros in the Zend part of the source. Although the PHP_* macro's are mostly aliased to the ZEND_* macros it gives a better understanding on what kind of macro you're calling. [7] When commenting out code using a #if statement, do NOT use 0 only. Instead use "_0". For example, #if FOO_0, where FOO is your cvs user foo. This allows easier tracking of why code was commented out, especially in bundled libraries. [8] Do not define functions that are not available. For instance, if a library is missing a function, do not define the PHP version of the function, and do not raise a run-time error about the function not existing. End users should use function_exists() to test for the existence of a function [9] Prefer emalloc(), efree(), estrdup(), etc. to their standard C library counterparts. These functions implement an internal "safety-net" mechanism that ensures the deallocation of any unfreed memory at the end of a request. They also provide useful allocation and overflow information while running in debug mode. In almost all cases, memory returned to the engine must be allocated using emalloc(). The use of malloc() should be limited to cases where a third-party library may need to control or free the memory, or when the memory in question needs to survive between multiple requests. Naming Conventions ------------------ [1] Function names for user-level functions should be enclosed with in the PHP_FUNCTION() macro. They should be in lowercase, with words underscore delimited, with care taken to minimize the letter count. Abbreviations should not be used when they greatly decrease the readability of the function name itself. Good: 'mcrypt_enc_self_test' 'mysql_list_fields' Ok: 'mcrypt_module_get_algo_supported_key_sizes' (could be 'mcrypt_mod_get_algo_sup_key_sizes'?) 'get_html_translation_table' (could be 'html_get_trans_table'?) Bad: 'hw_GetObjectByQueryCollObj' 'pg_setclientencoding' 'jf_n_s_i' [2] If they are part of a "parent set" of functions, that parent should be included in the user function name, and should be clearly related to the parent program or function family. This should be in the form of parent_*. A family of 'foo' functions, for example: Good: 'foo_select_bar' 'foo_insert_baz' 'foo_delete_baz' Bad: 'fooselect_bar' 'fooinsertbaz' 'delete_foo_baz' [3] Function names used by user functions should be prefixed with "_php_", and followed by a word or an underscore-delimited list of words, in lowercase letters, that describes the function. If applicable, they should be declared 'static'. [4] Variable names must be meaningful. One letter variable names must be avoided, except for places where the variable has no real meaning or a trivial meaning (e.g. for (i=0; i<100; i++) ...). [5] Variable names should be in lowercase. Use underscores to separate between words. [6] Method names follow the 'studlyCaps' (also referred to as 'bumpy case' or 'camel caps') naming convention, with care taken to minimize the letter count. The initial letter of the name is lowercase, and each letter that starts a new 'word' is capitalized. Good: 'connect()' 'getData()' 'buildSomeWidget()' Bad: 'get_Data()' 'buildsomewidget' 'getI()' [7] Classes should be given descriptive names. Avoid using abbreviations where possible. Each word in the class name should start with a capital letter, with words underscore delimited. The class name should be prefixed with the name of the 'parent set'. Good: 'Curl' 'Foo_Bar' Bad: 'foobar' 'foo_bar' 'FooBar' Syntax and indentation ---------------------- [1] Never use C++ style comments (i.e. // comment). Always use C-style comments instead. PHP is written in C, and is aimed at compiling under any ANSI-C compliant compiler. Even though many compilers accept C++-style comments in C code, you have to ensure that your code would compile with other compilers as well. The only exception to this rule is code that is Win32-specific, because the Win32 port is MS-Visual C++ specific, and this compiler is known to accept C++-style comments in C code. [2] Use K&R-style. Of course, we can't and don't want to force anybody to use a style he or she is not used to, but, at the very least, when you write code that goes into the core of PHP or one of its standard modules, please maintain the K&R style. This applies to just about everything, starting with indentation and comment styles and up to function declaration syntax. (see also http://www.catb.org/~esr/jargon/html/I/indent-style.html) [3] Be generous with whitespace and braces. Always prefer: if (foo) { bar; } to: if(foo)bar; Keep one empty line between the variable declaration section and the statements in a block, as well as between logical statement groups in a block. Maintain at least one empty line between two functions, preferably two. [4] When indenting, use the tab character. A tab is expected to represent four spaces. It is important to maintain consistency in indenture so that definitions, comments, and control structures line up correctly. Documentation and Folding Hooks ------------------------------- In order to make sure that the online documentation stays in line with the code, each user-level function should have its user-level function prototype before it along with a brief one-line description of what the function does. It would look like this: /* {{{ proto int abs(int number) Returns the absolute value of the number */ PHP_FUNCTION(abs) { ... } /* }}} */ The {{{ symbols are the default folding symbols for the folding mode in Emacs and vim (set fdm=marker). Folding is very useful when dealing with large files because you can scroll through the file quickly and just unfold the function you wish to work on. The }}} at the end of each function marks the end of the fold, and should be on a separate line. The "proto" keyword there is just a helper for the doc/genfuncsummary script which generates a full function summary. Having this keyword in front of the function prototypes allows us to put folds elsewhere in the code without messing up the function summary. Optional arguments are written like this: /* {{{ proto object imap_header(int stream_id, int msg_no [, int from_length [, int subject_length [, string default_host]]]) Returns a header object with the defined parameters */ And yes, please keep the prototype on a single line, even if that line is massive. New and Experimental Functions ----------------------------------- To reduce the problems normally associated with the first public implementation of a new set of functions, it has been suggested that the first implementation include a file labeled 'EXPERIMENTAL' in the function directory, and that the functions follow the standard prefixing conventions during their initial implementation. The file labelled 'EXPERIMENTAL' should include the following information: Any authoring information (known bugs, future directions of the module). Ongoing status notes which may not be appropriate for CVS comments. Aliases & Legacy Documentation ----------------------------------- You may also have some deprecated aliases with close to duplicate names, for example, somedb_select_result and somedb_selectresult. For documentation purposes, these will only be documented by the most current name, with the aliases listed in the documentation for the parent function. For ease of reference, user-functions with completely different names, that alias to the same function (such as highlight_file and show_source), will be separately documented. The proto should still be included, describing which function is aliased. Backwards compatible functions and names should be maintained as long as the code can be reasonably be kept as part of the codebase. See /phpdoc/README for more information on documentation.