Name

    ARB_fragment_program

Name Strings

    GL_ARB_fragment_program

Contributors

    Bob Beretta
    Pat Brown
    Matt Craighead
    Cass Everitt
    Evan Hart
    Jon Leech
    Bill Licea-Kane
    Bimal Poddar
    Jeremy Sandmel
    Jon Paul Schelter
    Avinash Seetharamaiah
    Nick Triantos
    and contributors to the ARB_vertex_program working group, the product of which provided the basis for this spec

Contact

    Benj Lipchak, ATI Research (blipchak 'at' ati.com)

IP Status

    Microsoft claims to own intellectual property related to this extension.

Status

    Complete. Approved by ARB on September 18, 2002

Version

    Last Modified Date: August 22, 2003
    Revision: 26

Number

    ARB Extension #27

Dependencies

    The extension is written against the OpenGL 1.3 Specification.

    OpenGL 1.3 is required.

    EXT_texture_lod_bias or OpenGL 1.4 is required.

    OpenGL 1.4 affects the definition of this extension.

    ARB_vertex_blend and EXT_vertex_weighting affect the definition of this extension.

    ARB_matrix_palette affects the definition of this extension.

    ARB_transpose_matrix affects the definition of this extension.

    EXT_fog_coord affects the definition of this extension.

    EXT_texture_rectangle affects the definition of this extension.

    ARB_shadow interacts with this extension.

    ARB_vertex_program interacts with this extension.

    ATI_fragment_shader interacts with this extension.

    NV_fragment_program interacts with this extension.

Overview

    Unextended OpenGL mandates a certain set of configurable per-fragment computations defining texture application, texture environment, color sum, and fog operations. Several extensions have added further per-fragment computations to OpenGL.
    For example, extensions have defined new texture environment capabilities (ARB_texture_env_add, ARB_texture_env_combine, ARB_texture_env_dot3, ARB_texture_env_crossbar), per-fragment depth comparisons (ARB_depth_texture, ARB_shadow, ARB_shadow_ambient, EXT_shadow_funcs), per-fragment lighting (EXT_fragment_lighting, EXT_light_texture), and environment mapped bump mapping (ATI_envmap_bumpmap).

    Each such extension adds a small set of relatively inflexible per-fragment computations.

    This inflexibility is in contrast to the typical flexibility provided by the underlying programmable floating point engines (whether micro-coded fragment engines, DSPs, or CPUs) that are traditionally used to implement OpenGL's texturing computations. The purpose of this extension is to expose to the OpenGL application writer a significant degree of per-fragment programmability for computing fragment parameters.

    For the purposes of discussing this extension, a fragment program is a sequence of floating-point 4-component vector operations that determines how a set of program parameters (not specific to an individual fragment) and an input set of per-fragment parameters are transformed to a set of per-fragment result parameters.

    The per-fragment computations for standard OpenGL given a particular set of texture and fog application modes (along with any state for extensions defining per-fragment computations) is, in essence, a fragment program. However, the sequence of operations is defined implicitly by the current OpenGL state settings rather than defined explicitly as a sequence of instructions.

    This extension provides an explicit mechanism for defining fragment program instruction sequences for application-defined fragment programs. In order to define such fragment programs, this extension defines a fragment programming model including a floating-point 4-component vector instruction set and a relatively large set of floating-point 4-component registers.
    The extension's fragment programming model is designed for efficient hardware implementation and to support a wide variety of fragment programs. By design, the entire set of existing fragment programs defined by existing OpenGL per-fragment computation extensions can be implemented using the extension's fragment programming model.

Issues

    This extension is closely related to ARB_vertex_program, and is in sync with revision 36 of that spec. ARB_fragment_program will continue to track changes made to ARB_vertex_program.

    (1) Should we provide precision queries?

      RESOLVED: We've decided not to include precision queries. Implementations are expected to meet or exceed the precision guidelines set forth in the core GL spec, section 2.1.1, p. 6, as amended by this extension.

      To summarize section 2.1.1, the maximum representable magnitude of colors must be at least 2^10, while the maximum representable magnitude of other floating-point values must be at least 2^32. The individual results of floating-point operations must be accurate to about 1 part in 10^5.

      Here are the reasons why precision queries were not included:

        1. It is unclear what the queries should be:
           a) min, max, [0,1) granularity
           b) min +, max +, min -, max -, [0,1) granularity
           c) IEEE mantissa bits, IEEE exponent bits

        2. Due to instruction emulation, there is no way to query the actual precision that can be expected. Should the query return the best-case or worst-case precision?

        3. Implementations may support multiple precisions, on a per-instruction basis or across the board. How would this be exposed?

        4. Current implementations are able to meet the minimum requirements specified in the core GL, thanks to its sufficiently loose wording "... so that the individual results of floating-point operations are accurate to ABOUT 1 part in 10^5." (Emphasis added.)

        5. A conformance test can act as watchdog to ensure implementations are not cutting corners on precision.

        6. Adding precision queries would require a new entrypoint.

      See issue 22 regarding reduced-precision modes.

    (2) Should the LOD biased texture sample be optional?

      RESOLVED: TXB support is mandatory. This exposes useful functionality which enables blurring and sharpening effects. It would be more useful to entirely override derivatives (scale factor) rather than just biasing the level-of-detail. This would be a future extension to fragment programs.

      It should be noted here that the bias introduced per-fragment by TXB is added to any per-object or per-stage LOD bias. If per-fragment LOD bias is not necessary, using the per-object and/or per-stage LOD biases may perform better.

    (3) Should we include the ability to bind to the color matrix? How about others? Program matrices?

      RESOLVED: We will not specifically add anything that depends on the ARB_imaging subset. So we have not included matrix bindings to the color matrix (or parameter bindings to the color biases, etc.). However, we have included matrix binding support and support for all of the matrices present in ARB_vertex_program.

    (4) Should we include the ability to bind to just a texcoord attribute's S,T components? (Or just S, or S,T,P for that matter?)

      RESOLVED: No. Issue #15 below obviates this issue by making the texture coordinate usage within a program explicit, thereby making optimizations to reduce the number of interpolated texture coordinates something an implementation can do at compile time instead of having to do it during every texture target change.

    (5) What other instructions should be added? Should any be removed?

      RESOLVED: The differences between the ARB_vertex_program instruction set and the ARB_fragment_program instruction set are minimal. ARB_fragment_program removes the LOG and EXP rough approximation instructions and the ARL address register load instruction.
      ARB_fragment_program adds the SIN/COS/SCS trigonometric instructions, the LRP linear interpolation instruction, the CMP compare instruction, and the TEX/TXP/TXB/KIL texture instructions.

    (6) Should depth output be a program option or a mandatory feature?

      RESOLVED: Depth output capability should be mandatory.

    (6a) How should per-vertex geometric depth clipping be handled when replacing depth in a fragment program?

      RESOLVED: Per-vertex geometric depth clipping should be performed by the GL as usual, so no spec change is required. The ideal behavior would be to disable near and far clipping planes when replacing depth, but not all implementations can natively support disabling individual clip planes.

    (6b) How should depth output from the fragment program be further processed before being handed to the per-fragment operations?

      RESOLVED: Depth gets clamped by GL to [0,1]. The application has access to the depth range as a bindable parameter if it wants to either scale and bias its depth to fall within the depth range, or to kill fragments outside the depth range.

    (7) If a fragment program does not write a color value, what should be the final color of the fragment?

      RESOLVED: The final fragment color is undefined. Note that it may be perfectly reasonable to have a program that computes depth values but not colors. Fragment colors are often irrelevant if color writes are disabled (via ColorMask).

    (7a) If a fragment program does not write a depth value, what should be the final depth value of the fragment?

      RESOLVED: "Depth fly-over" (using the conventional depth produced by rasterization) should happen whenever a depth-replacing program is not in use. A depth-replacing program is defined as a program that writes to result.depth in at least one instruction. The presence of a depth declaration alone DOES NOT designate a depth-replacing program.
      The intention is that a future extension introducing conditional execution will still consider a program to be depth-replacing even if the instruction(s) writing to result.depth do(es) not execute.

      Other considered definitions of depth-replacing program:

        1. The presence of a depth declaration -OR- the use of result.depth as an instruction destination anywhere in the program designates a depth-replacing program.

        2. Every program is a depth-replacing program, but the GL initializes the depth output to be the depth produced by rasterization. The app may then overwrite the depth output.

        3. Every program is a depth-replacing program, and the app is solely responsible for copying the depth input to depth output if desired.

    (8) Should relative addressing, like that defined in ARB_vertex_program, be supported in this spec?

      RESOLVED: No, relative addressing won't be included in this spec.

    (9) Should full-featured operand component swizzling, like that defined in ARB_vertex_program, be supported in this spec?

      RESOLVED: Yes, full swizzling is mandatory.

    (10) Should texture instructions contain specific limitations on operations that can be performed? For example, should write masks or operand component swizzling be disallowed?

      RESOLVED: Texture instructions are specified to be very similar to ALU instructions. They have been given 3-letter names, they allow writemasking and saturation (which would be useful for floating-point texture formats), source swizzles and negates, and the ability to use parameters as sources.

    (11) Should we standardize options for stencil or aux data buffer outputs?

      RESOLVED: Stencil and aux data buffers will be saved for a possible future extension to fragment programs.

    (12) Should depth output be pulled from the 3rd or 4th component?

      RESOLVED: 3rd component, as the 3rd component is also used for depth input from the "fragment.position" attribute.

    (13) Which stages are subsumed by fragment programs?

      RESOLVED: Texturing, color sum, and fog.
(14) What should the minimum resource limits be? RESOLVED: 10 attributes, 24 parameters, 4 texture indirections, 48 ALU instructions, 24 texture instructions, and 16 temporaries. (15) OpenGL provides a hierarchy of texture enables (cube map, 3D, 2D, 1D). Should the texture sampling instructions here override that hierarchy and select specific texture targets? RESOLVED: Yes. This removes a potential pitfall for developers: leaving the hierarchy of enables in an undesired state. It makes programs more readable as the intent of the sample is more obvious. Finally, it allows compilers to be more aggressive as to which texcoord components are "don't cares" without having to recompile programs when fixed-function texenables change. One drawback is that programs cannot be reused for both 2D and 3D texturing, for example, by simply changing the texture enables. Texture sampling can be specified by instructions like TEX myTexel, fragment.texcoord[1], texture[2], 3D; which would indicate to use texture coordinate set number 1 to sample from the texture object bound to the TEXTURE_3D target on texture image unit 2. Each texture unit can have only one "active" target. Programs are not allowed to reference different texture targets in the same texture image unit. In the example above, any other texture instructions using texture image unit 2 must specify the 3D texture target. Note that every texture image unit always has a texture bound to every texture target, whether it is a named texture object or a default texture. However, the texture may not be complete as defined in section 3.8.9 of the core GL spec. See issue 23. (16) Should aux texture units be additional units on top of existing full-featured texture units, or should this spec fully deprecate "legacy" texture units and only expose texture coordinate sets and texture image units? 
      Background: Some implementations are able to expose more "texture image units" (texture maps and associated parameters) than "texture coordinate sets" (current texcoords, texgen, and texture matrices). A conventional GL "texture unit" encompasses both a texture image unit and a texture coordinate set as well as texture environment state.

      RESOLVED: Yes, deprecate "legacy" texture units. This is a more flexible model.

    (17) Should fragment programs affect all fragments, or just those produced by the rasterization of points, lines, and triangles?

      RESOLVED: Every fragment generated by the GL is subject to fragment program mode. This includes point, line, and polygon primitives as well as pixel rectangles and bitmaps.

    (18) Should per-fragment position and fogcoord be bindable as fragment attributes?

      RESOLVED: Yes, interpolated fogcoord will make per-fragment fog application possible, in addition to full fog stage subsumption. Interpolated window position, especially depth, enables interesting depth-replacing algorithms.

    (19) What characters should be used to identify individual components in swizzle selectors and write masks?

      RESOLVED: ARB_vertex_program provides "xyzw". This extension supports "xyzw" and also provides "rgba" for better readability when dealing with RGBA color values. Adding support for special identifiers for dealing with texture coordinates was considered and rejected. "strq" could be used to identify texture coordinate components, but the "r" would conflict with the "r" from "rgba". "stpq" would be another possibility, but could be a source of confusion.

    (20) Should implementations be required to support all programs that fit within the exported limits on the number of resources (e.g., instructions, temporaries) that can be present in a program, even if it means falling back to software? Should implementations be required to reject programs that could never be accelerated?

      RESOLVED: No and no.
      An implementation is allowed to fail ProgramStringARB due to the program exceeding native resources. Note that this failure must be invariant with respect to all other OpenGL state. In other words, a program cannot load successfully with default state, but then fail to load when certain GL state is altered. However, an implementation is not required to fail when a program would exceed native resources, and is in fact encouraged to fall back to a software path. See issue 21 for a way of determining if this has happened.

      This notable departure from ARB_vertex_program was made as an accommodation to vendors who could not justify implementing a software fallback path which would be relatively slow even compared to an ARB_vertex_program software fallback path.

      Two issues with this decision:

        1. The API limits become hints, and one can no longer tell by visual inspection whether or not a program will load on every implementation.

        2. Program loading will now depend on the optimizer, which may vary from release to release of an implementation. A program that loaded successfully when an ISV first wrote it may fail to load in a future driver version, and vice versa.

    (21) How can applications determine if their programs are too large to run on the native (likely hardware) implementation, and therefore may run with reduced performance?

      RESOLVED: The following code snippet uses a native resource query to guarantee a program is loaded natively (or not at all):

        GLboolean ProgramStringIsNative(GLenum target, GLenum format,
                                        GLsizei len, const GLvoid *string)
        {
            GLint errorPos, isNative;
            glProgramStringARB(target, format, len, string);
            glGetIntegerv(GL_PROGRAM_ERROR_POSITION_ARB, &errorPos);
            glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB,
                              GL_PROGRAM_UNDER_NATIVE_LIMITS_ARB, &isNative);
            if ((errorPos == -1) && (isNative == 1))
                return GL_TRUE;
            else
                return GL_FALSE;
        }

      Note that a program that successfully loads, and falls under the native limits, is still not guaranteed to execute in hardware.
Lack of other resources (e.g., texture memory) or the use of other OpenGL features not natively supported by the implementation (e.g., textures with borders) may also prevent the program from executing in hardware. (22) Should we provide applications with a method to control the level of precision used to carry out fragment program computations? RESOLVED: Yes. The GL implementation ultimately has control over the level of precision used for fragment program computations. However, the "ARB_precision_hint_fastest" and "ARB_precision_hint_nicest" program options allow applications to guide the GL implementation in its precision selection. The "fastest" option encourages the GL to minimize execution time, with possibly reduced precision. The "nicest" option encourages the GL to maximize precision, with possibly increased execution time. If the precision hint is not "fastest", GL implementations should perform low-precision operations only if they could not appreciably affect the final results of the program. Regardless of the precision hint, GL implementations are discouraged from reducing the precision of computations so aggressively that final rendering results could be seriously compromised due to overflow of intermediate values or insufficient number of mantissa bits. Some implementations may provide only a single level of precision, in which case these hints may have no effect. However, all implementations will accept these options, even if they are silently ignored. More explicit control of precision, such as provided in "C" with data types such as "short", "int", "float", "double", may also be a desirable feature, but this level of detail is left to a separate extension. (23) What is the result of a sample from an incomplete texture? The definition of texture completeness can be found in section 3.8.9 of the core GL spec. RESOLVED: The result of a sample from an incomplete texture is the constant vector (0,0,0,1). 
The benefit of defining the result to be a constant is that broken apps are guaranteed to generate unexpected (black) results from their bad samples. If we were to leave the result undefined, some implementations may generate expected results some of the time, for example when magfiltering, giving app developers a false sense of correctness in their apps. (24) What is a texture indirection, and how is it counted? RESOLVED: On some implementations, fragment programs that have complex texture dependency chains may not be supported, even if the instruction counts fit within the exported limits. A texture dependency occurs when a texture instruction depends on the result of a previous instruction (ALU or texture) for use as its texture coordinate. A texture indirection can be considered a node in the texture dependency chain. Each node contains a set of texture instructions which execute in parallel, followed by a sequence of ALU instructions. A dependent texture instruction is one that uses a temporary as an input coordinate rather than an attribute or a parameter. A program with no dependent texture instructions (or no texture instructions at all) will have a single node in its texture dependency chain, and thus a single indirection. API-level texture indirections are counted by keeping track of which temporaries are read and written within the current node in the texture dependency chain. When a texture instruction is encountered, an indirection may be added and a new node started if either of the following two conditions is true: 1. the source coordinate of the texture instruction is a temporary that has already been written in the current node, either by a previous texture instruction or ALU instruction; 2. the result of the texture instruction is a temporary that has already been read or written in the current node by an ALU instruction. The texture instruction provoking a new indirection and all subsequent instructions are added to the new node. 
      This process is repeated until the end of the program is encountered. Below is some pseudo-code to describe this:

        indirections = 1;
        tempsOutput = 0;
        aluTemps = 0;
        while (i = getInst())
        {
          if (i.type == TEX)
          {
            if (((i.input.type == TEMP) &&
                 (tempsOutput & (1 << i.input.index))) ||
                ((i.op != KILL) && (i.output.type == TEMP) &&
                 (aluTemps & (1 << i.output.index))))
            {
              indirections++;
              tempsOutput = 0;
              aluTemps = 0;
            }
          }
          else
          {
            if (i.input1.type == TEMP)
              aluTemps |= (1 << i.input1.index);
            if (i.input2 && i.input2.type == TEMP)
              aluTemps |= (1 << i.input2.index);
            if (i.input3 && i.input3.type == TEMP)
              aluTemps |= (1 << i.input3.index);
            if (i.output.type == TEMP)
              aluTemps |= (1 << i.output.index);
          }

          if ((i.op != KILL) && (i.output.type == TEMP))
            tempsOutput |= (1 << i.output.index);
        }

      For example, the following programs would have 1, 2, and 3 texture indirections, respectively:

        !!ARBfp1.0
        # No texture instructions, but always 1 indirection
        MOV result.color, fragment.color;
        END

        !!ARBfp1.0
        # A simple dependent texture instruction, 2 indirections
        TEMP myColor;
        MUL myColor, fragment.texcoord[0], fragment.texcoord[1];
        TEX result.color, myColor, texture[0], 2D;
        END

        !!ARBfp1.0
        # A more complex example with 3 indirections
        TEMP myColor1, myColor2;
        TEX myColor1, fragment.texcoord[0], texture[0], 2D;
        MUL myColor1, myColor1, myColor1;
        TEX myColor2, fragment.texcoord[1], texture[1], 2D;
        # so far we still only have 1 indirection
        TEX myColor2, myColor1, texture[2], 2D;      # This is #2
        TEX result.color, myColor2, texture[3], 2D;  # And #3
        END

      Note that writemasks for the temporaries written and swizzles for the temporaries read are not taken into consideration when counting indirections. This makes hand-counting of indirections by a developer an easier task.
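
      The counting rule for API-level texture indirections can also be checked offline with a small C routine. The following is only a sketch under stated assumptions: the Inst, Reg, RegType, and OpKind types and the R_TEMP, R_OTHER, and R_NONE macros are a hypothetical, simplified instruction model invented for illustration and are not part of this extension. It assumes KIL is treated as a texture-type instruction (as the pseudo-code's KILL test implies) and that temporary indices are below 32 so they fit in a bit mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction model, for illustration only. */
typedef enum { REG_NONE, REG_TEMP, REG_OTHER } RegType; /* OTHER: attrib, param, result */
typedef enum { OP_ALU, OP_TEX, OP_KIL } OpKind;         /* OP_TEX: TEX, TXP, or TXB */

typedef struct { RegType type; unsigned index; } Reg;
typedef struct { OpKind op; Reg dst; Reg src[3]; } Inst; /* src[0] is the texcoord for TEX/KIL */

#define R_TEMP(n) { REG_TEMP, (n) }
#define R_OTHER   { REG_OTHER, 0 }
#define R_NONE    { REG_NONE, 0 }

/* Bit for a temporary register, or 0 for any other register kind. */
static uint32_t temp_bit(Reg r)
{
    if (r.type != REG_TEMP)
        return 0;
    assert(r.index < 32u); /* assumption: temporaries fit in a 32-bit mask */
    return (uint32_t)1 << r.index;
}

/* Counts API-level texture indirections; writemasks and swizzles are ignored. */
unsigned count_tex_indirections(const Inst *prog, size_t n)
{
    unsigned indirections = 1;  /* every program has at least one node */
    uint32_t tempsOutput = 0;   /* temporaries written in the current node */
    uint32_t aluTemps = 0;      /* temporaries read or written by ALU ops */

    for (size_t k = 0; k < n; k++) {
        const Inst *i = &prog[k];
        if (i->op == OP_TEX || i->op == OP_KIL) {
            int coordWritten = (temp_bit(i->src[0]) & tempsOutput) != 0;
            int destUsedByAlu = (i->op != OP_KIL) &&
                                (temp_bit(i->dst) & aluTemps) != 0;
            if (coordWritten || destUsedByAlu) {
                indirections++;     /* start a new node */
                tempsOutput = 0;
                aluTemps = 0;
            }
        } else {
            for (int s = 0; s < 3; s++)
                aluTemps |= temp_bit(i->src[s]);
            aluTemps |= temp_bit(i->dst);
        }
        if (i->op != OP_KIL)
            tempsOutput |= temp_bit(i->dst);
    }
    return indirections;
}

/* The three example programs above, encoded in the hypothetical model. */
const Inst example1[] = {
    { OP_ALU, R_OTHER,   { R_OTHER,   R_NONE,    R_NONE } }, /* MOV result.color, fragment.color */
};
const Inst example2[] = {
    { OP_ALU, R_TEMP(0), { R_OTHER,   R_OTHER,   R_NONE } }, /* MUL myColor, tc0, tc1 */
    { OP_TEX, R_OTHER,   { R_TEMP(0), R_NONE,    R_NONE } }, /* TEX result.color, myColor */
};
const Inst example3[] = {
    { OP_TEX, R_TEMP(0), { R_OTHER,   R_NONE,    R_NONE } }, /* TEX myColor1, tc0 */
    { OP_ALU, R_TEMP(0), { R_TEMP(0), R_TEMP(0), R_NONE } }, /* MUL myColor1, myColor1, myColor1 */
    { OP_TEX, R_TEMP(1), { R_OTHER,   R_NONE,    R_NONE } }, /* TEX myColor2, tc1 */
    { OP_TEX, R_TEMP(1), { R_TEMP(0), R_NONE,    R_NONE } }, /* TEX myColor2, myColor1 */
    { OP_TEX, R_OTHER,   { R_TEMP(1), R_NONE,    R_NONE } }, /* TEX result.color, myColor2 */
};
```

      Calling count_tex_indirections on example1, example2, and example3 with their respective lengths yields 1, 2, and 3, matching the counts given for the three programs. A real implementation would obtain its instruction stream from its own parser, and native indirection counts may differ.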
      Native texture indirections may be counted differently by an implementation to reflect its exact restrictions, to reflect the true dependencies taking into account writemasks and swizzles, and to reflect optimizations such as instruction reordering.

      For implementations with no restrictions on the number of indirections, the maximum indirection count will equal the maximum texture instruction count.

    (25) How can a program reduce SCS's scalar operand to the fundamental period [-PI,PI]?

      RESOLVED: Unlike the individual SIN and COS instructions, SCS requires that its argument be reduced ahead of time to the fundamental period. The reason SCS doesn't perform this operation automatically is that it may create redundant work for programs that already have their operand in the correct range. Other programs that do need to reduce their operand simply need to add a block of code before the SCS instruction:

        PARAM myParams = { 0.5, -3.14159, 6.28319, 0.15915 };
        MAD myOperand.x, myOperand.x, myParams.w, myParams.x; # a = (a/(2*PI))+0.5
        FRC myOperand.x, myOperand.x;                         # a = frac(a)
        MAD myOperand.x, myOperand.x, myParams.z, myParams.y; # a = (a*2*PI)-PI
        ...
        SCS myResult, myOperand.x;

    (26) Is depth output from a fragment program guaranteed to be invariant with respect to depth produced via conventional rasterization?

      RESOLVED: No. The floating-point representation of depth values output from a fragment program may lead to the output of depth with less precision than the depth output by conventional GL rasterization. For example, a floating-point representation with 16 bits of mantissa will certainly produce depth with lesser precision than that of conventional rasterization used in conjunction with a 24-bit depth buffer, where all values are maintained as integers. Be aware of this when mixing conventional GL rendering with fragment program rendering.

    (27) How can conventional GL fog application be achieved within a fragment program?
      RESOLVED: Program options have been introduced that allow a program to request fog to be applied to the final clamped fragment color before being passed along to the antialiasing application stage. This makes it easy for:

        1. developers to request conventional fog behavior
        2. implementations with dedicated fog hardware to use it
        3. implementations without dedicated fog hardware, so they need not track fog state after compilation, and constantly recompile when fog state changes.

      The three mandatory options are ARB_fog_exp, ARB_fog_exp2, and ARB_fog_linear. As these options are mutually exclusive by nature, specifying more than one is not useful. If more than one is specified, the last one encountered in the program string will be the one to actually modify the execution environment.

    (28) Why have all of the enums, entrypoints, GLX protocol, and spec language shared with ARB_vertex_program been reproduced here?

      RESOLVED: The two extensions are independent of one another, in so far as an implementation need not support both of them in order to support one of them. Everything needed to implement or make use of ARB_fragment_program is present in this spec without the need to refer to the ARB_vertex_program spec. When and if these two extensions are incorporated into the core OpenGL, the significant overlap of the two will be collapsed into a single instance of the shared parts.

    (29) How might an implementation implement the fog options? To what does the extra resource consumption described in 3.11.4.5.1 correspond?

      RESOLVED: The following code snippets reflect possible implementations of the fog options. While an implementation may use other instruction sequences to achieve the same result, or may use external fog hardware if available, all implementations must enforce the API-level resource consumption as described: 2 params, 1 temp, 1 attribute, and 3, 4, or 2 instructions. "finalColor" in the examples below is the color that would otherwise be "result.color", with components clamped to the range [0,1]. "result.color.a" is assumed to have already been written, as fog blending does not affect the alpha component.

        EXP:

          # Exponential fog
          # f = exp(-d*z)
          #
          PARAM p = {DENSITY/LN(2), NOT USED, NOT USED, NOT USED};
          PARAM fogColor = state.fog.color;
          TEMP fogFactor;
          ATTRIB fogCoord = fragment.fogcoord.x;
          MUL fogFactor.x, p.x, fogCoord.x;
          EX2_SAT fogFactor.x, -fogFactor.x;
          LRP result.color.rgb, fogFactor.x, finalColor, fogColor;

        EXP2:

          #
          # 2nd-order Exponential fog
          # f = exp(-(d*z)^2)
          #
          PARAM p = {DENSITY/SQRT(LN(2)), NOT USED, NOT USED, NOT USED};
          PARAM fogColor = state.fog.color;
          TEMP fogFactor;
          ATTRIB fogCoord = fragment.fogcoord.x;
          MUL fogFactor.x, p.x, fogCoord.x;
          MUL fogFactor.x, fogFactor.x, fogFactor.x;
          EX2_SAT fogFactor.x, -fogFactor.x;
          LRP result.color.rgb, fogFactor.x, finalColor, fogColor;

        LINEAR:

          #
          # Linear fog
          # f = (end-z)/(end-start)
          #
          PARAM p = {-1/(END-START), END/(END-START), NOT USED, NOT USED};
          PARAM fogColor = state.fog.color;
          TEMP fogFactor;
          ATTRIB fogCoord = fragment.fogcoord.x;
          MAD_SAT fogFactor.x, p.x, fogCoord.x, p.y;
          LRP result.color.rgb, fogFactor.x, finalColor, fogColor;

    (30) Why is the order of operands for the CMP instruction different than the order used by another popular graphics API?

      RESOLVED: No other graphics API was used as a basis for the design of ARB_fragment_program except ARB_vertex_program, which did not have a CMP instruction. This independent evolution naturally led to differences in minor details such as order of operands. This discrepancy is noted here to help developers familiar with the other API to avoid this potential pitfall.

    (31) Is depth offset applied to the window z value before it enters the fragment program?

      RESOLVED: As in the base OpenGL specification, the depth offset generated by polygon offset is added during polygon rasterization.
      The depth value provided to shaders in the fragment.position.z attribute already includes polygon offset, if enabled. If the depth value is replaced by a fragment program, the polygon offset value will NOT be recomputed and added back after fragment program execution.

      NOTE: This is probably not desirable for fragment programs that modify depth values since the partials used to generate the offset may not match the partials of the computed depth value.

New Procedures and Functions

    void ProgramStringARB(enum target, enum format, sizei len, const void *string);

    void BindProgramARB(enum target, uint program);

    void DeleteProgramsARB(sizei n, const uint *programs);

    void GenProgramsARB(sizei n, uint *programs);

    void ProgramEnvParameter4dARB(enum target, uint index, double x, double y, double z, double w);
    void ProgramEnvParameter4dvARB(enum target, uint index, const double *params);
    void ProgramEnvParameter4fARB(enum target, uint index, float x, float y, float z, float w);
    void ProgramEnvParameter4fvARB(enum target, uint index, const float *params);

    void ProgramLocalParameter4dARB(enum target, uint index, double x, double y, double z, double w);
    void ProgramLocalParameter4dvARB(enum target, uint index, const double *params);
    void ProgramLocalParameter4fARB(enum target, uint index, float x, float y, float z, float w);
    void ProgramLocalParameter4fvARB(enum target, uint index, const float *params);

    void GetProgramEnvParameterdvARB(enum target, uint index, double *params);
    void GetProgramEnvParameterfvARB(enum target, uint index, float *params);

    void GetProgramLocalParameterdvARB(enum target, uint index, double *params);
    void GetProgramLocalParameterfvARB(enum target, uint index, float *params);

    void GetProgramivARB(enum target, enum pname, int *params);

    void GetProgramStringARB(enum target, enum pname, void *string);

    boolean IsProgramARB(uint program);

New Tokens

    Accepted by the cap parameter of Disable, Enable, and IsEnabled, by the pname parameter of GetBooleanv, GetIntegerv, GetFloatv, and GetDoublev, and by the target parameter of ProgramStringARB, BindProgramARB, ProgramEnvParameter4[df][v]ARB, ProgramLocalParameter4[df][v]ARB, GetProgramEnvParameter[df]vARB, GetProgramLocalParameter[df]vARB, GetProgramivARB and GetProgramStringARB.

        FRAGMENT_PROGRAM_ARB                     0x8804

    Accepted by the format parameter of ProgramStringARB:

        PROGRAM_FORMAT_ASCII_ARB                 0x8875

    Accepted by the pname parameter of GetProgramivARB:

        PROGRAM_LENGTH_ARB                       0x8627
        PROGRAM_FORMAT_ARB                       0x8876
        PROGRAM_BINDING_ARB                      0x8677
        PROGRAM_INSTRUCTIONS_ARB                 0x88A0
        MAX_PROGRAM_INSTRUCTIONS_ARB             0x88A1
        PROGRAM_NATIVE_INSTRUCTIONS_ARB          0x88A2
        MAX_PROGRAM_NATIVE_INSTRUCTIONS_ARB      0x88A3
        PROGRAM_TEMPORARIES_ARB                  0x88A4
        MAX_PROGRAM_TEMPORARIES_ARB              0x88A5
        PROGRAM_NATIVE_TEMPORARIES_ARB           0x88A6
        MAX_PROGRAM_NATIVE_TEMPORARIES_ARB       0x88A7
        PROGRAM_PARAMETERS_ARB                   0x88A8
        MAX_PROGRAM_PARAMETERS_ARB               0x88A9
        PROGRAM_NATIVE_PARAMETERS_ARB            0x88AA
        MAX_PROGRAM_NATIVE_PARAMETERS_ARB        0x88AB
        PROGRAM_ATTRIBS_ARB                      0x88AC
        MAX_PROGRAM_ATTRIBS_ARB                  0x88AD
        PROGRAM_NATIVE_ATTRIBS_ARB               0x88AE
        MAX_PROGRAM_NATIVE_ATTRIBS_ARB           0x88AF
        MAX_PROGRAM_LOCAL_PARAMETERS_ARB         0x88B4
        MAX_PROGRAM_ENV_PARAMETERS_ARB           0x88B5
        PROGRAM_UNDER_NATIVE_LIMITS_ARB          0x88B6
        PROGRAM_ALU_INSTRUCTIONS_ARB             0x8805
        PROGRAM_TEX_INSTRUCTIONS_ARB             0x8806
        PROGRAM_TEX_INDIRECTIONS_ARB             0x8807
        PROGRAM_NATIVE_ALU_INSTRUCTIONS_ARB      0x8808
        PROGRAM_NATIVE_TEX_INSTRUCTIONS_ARB      0x8809
        PROGRAM_NATIVE_TEX_INDIRECTIONS_ARB      0x880A
        MAX_PROGRAM_ALU_INSTRUCTIONS_ARB         0x880B
        MAX_PROGRAM_TEX_INSTRUCTIONS_ARB         0x880C
        MAX_PROGRAM_TEX_INDIRECTIONS_ARB         0x880D
        MAX_PROGRAM_NATIVE_ALU_INSTRUCTIONS_ARB  0x880E
        MAX_PROGRAM_NATIVE_TEX_INSTRUCTIONS_ARB  0x880F
        MAX_PROGRAM_NATIVE_TEX_INDIRECTIONS_ARB  0x8810

    Accepted by the pname parameter of GetProgramStringARB:

        PROGRAM_STRING_ARB                       0x8628

    Accepted by the pname parameter of GetBooleanv, GetIntegerv, GetFloatv, and GetDoublev:

        PROGRAM_ERROR_POSITION_ARB               0x864B
        CURRENT_MATRIX_ARB                       0x8641
        TRANSPOSE_CURRENT_MATRIX_ARB             0x88B7
        CURRENT_MATRIX_STACK_DEPTH_ARB           0x8640
        MAX_PROGRAM_MATRICES_ARB                 0x862F
    MAX_PROGRAM_MATRIX_STACK_DEPTH_ARB          0x862E
    MAX_TEXTURE_COORDS_ARB                      0x8871
    MAX_TEXTURE_IMAGE_UNITS_ARB                 0x8872

Accepted by the <name> parameter of GetString:

    PROGRAM_ERROR_STRING_ARB                    0x8874

Accepted by the <mode> parameter of MatrixMode:

    MATRIX0_ARB                                 0x88C0
    MATRIX1_ARB                                 0x88C1
    MATRIX2_ARB                                 0x88C2
    MATRIX3_ARB                                 0x88C3
    MATRIX4_ARB                                 0x88C4
    MATRIX5_ARB                                 0x88C5
    MATRIX6_ARB                                 0x88C6
    MATRIX7_ARB                                 0x88C7
    MATRIX8_ARB                                 0x88C8
    MATRIX9_ARB                                 0x88C9
    MATRIX10_ARB                                0x88CA
    MATRIX11_ARB                                0x88CB
    MATRIX12_ARB                                0x88CC
    MATRIX13_ARB                                0x88CD
    MATRIX14_ARB                                0x88CE
    MATRIX15_ARB                                0x88CF
    MATRIX16_ARB                                0x88D0
    MATRIX17_ARB                                0x88D1
    MATRIX18_ARB                                0x88D2
    MATRIX19_ARB                                0x88D3
    MATRIX20_ARB                                0x88D4
    MATRIX21_ARB                                0x88D5
    MATRIX22_ARB                                0x88D6
    MATRIX23_ARB                                0x88D7
    MATRIX24_ARB                                0x88D8
    MATRIX25_ARB                                0x88D9
    MATRIX26_ARB                                0x88DA
    MATRIX27_ARB                                0x88DB
    MATRIX28_ARB                                0x88DC
    MATRIX29_ARB                                0x88DD
    MATRIX30_ARB                                0x88DE
    MATRIX31_ARB                                0x88DF

Additions to Chapter 2 of the OpenGL 1.3 Specification (OpenGL Operation)

Modify Section 2.1.1, Floating-Point Computation (p. 6)

(modify first paragraph, p. 6) ... The maximum representable magnitude of a floating-point number used to represent position, normal, or texture coordinates must be at least 2^32; the maximum representable magnitude for colors must be at least 2^10. ...

Modify Section 2.7, Vertex Specification (p. 19)

(modify second paragraph, p. 20) Implementations support more than one set of texture coordinates. The commands

    void MultiTexCoord{1234}{sifd}(enum texture, T coords);
    void MultiTexCoord{1234}{sifd}v(enum texture, T coords);

take the coordinate set to be modified as the <texture> parameter. <texture> is a symbolic constant of the form TEXTUREi, indicating that texture coordinate set i is to be modified. The constants obey TEXTUREi = TEXTURE0 + i (i is in the range 0 to k-1, where k is the implementation-dependent number of texture units defined by MAX_TEXTURE_COORDS_ARB).

Modify Section 2.8, Vertex Arrays (p. 21)

(modify first paragraph, p. 21) ...
The client may specify up to 5 plus the value of MAX_TEXTURE_COORDS_ARB arrays: one each to store vertex coordinates...

(modify first paragraph, p. 23) The command

    void ClientActiveTexture(enum texture);

is used to select the vertex array client state parameters to be modified by the TexCoordPointer command and the array affected by EnableClientState and DisableClientState with parameter TEXTURE_COORD_ARRAY. This command sets the client state variable CLIENT_ACTIVE_TEXTURE. Each texture coordinate set has a client state vector which is selected when this command is invoked. This state vector includes the vertex array state. This call also selects the texture coordinate set state used for queries of client state.

(modify first paragraph, p. 28) If the number of supported texture coordinate sets (the value of MAX_TEXTURE_COORDS_ARB) is k, ...

Modify Section 2.10.2, Matrices (p. 31)

(modify first paragraph, p. 31) The projection matrix and model-view matrix are set and modified with a variety of commands. The affected matrix is determined by the current matrix mode. The current matrix mode is set with

    void MatrixMode(enum mode);

which takes one of the pre-defined constants TEXTURE, MODELVIEW, COLOR, PROJECTION, or MATRIX<i>_ARB as the argument. In the case of MATRIX<i>_ARB, <i> is an integer between 0 and <n>-1 indicating one of <n> program matrices where <n> is the value of the implementation defined constant MAX_PROGRAM_MATRICES_ARB. Such program matrices are described in section 3.11.7. TEXTURE is described later in section 2.10.2, and COLOR is described in section 3.6.3. If the current matrix mode is MODELVIEW, then matrix operations apply to the model-view matrix; if PROJECTION, then they apply to the projection matrix.

(modify first paragraph, p. 34) For each texture coordinate set, a 4x4 matrix is applied to the corresponding texture coordinates...

(modify first and second paragraphs, p.
35) The command

    void ActiveTexture(enum texture);

specifies the active texture unit selector, ACTIVE_TEXTURE. Each texture unit contains up to two distinct sub-units: a texture coordinate processing unit (consisting of a texture matrix stack and texture coordinate generation state) and a texture image unit (consisting of all the texture state defined in Section 3.8). In implementations with a different number of supported texture coordinate sets and texture image units, some texture units may consist of only one of the two sub-units.

The active texture unit selector specifies the texture coordinate set accessed by commands involving texture coordinate processing. Such commands include those accessing the current matrix stack (if MATRIX_MODE is TEXTURE), TexGen (section 2.10.4), Enable/Disable (if any texture coordinate generation enum is selected), as well as queries of the current texture coordinates and current raster texture coordinates. If the texture coordinate set number corresponding to the current value of ACTIVE_TEXTURE is greater than or equal to the implementation-dependent constant MAX_TEXTURE_COORDS_ARB, the error INVALID_OPERATION is generated by any such command.

The active texture unit selector also selects the texture image unit accessed by commands involving texture image processing (section 3.8). Such commands include all variants of TexEnv, TexParameter, and TexImage commands, BindTexture, Enable/Disable for any texture target (e.g., TEXTURE_2D), and queries of all such state. If the texture image unit number corresponding to the current value of ACTIVE_TEXTURE is greater than or equal to the implementation-dependent constant MAX_TEXTURE_IMAGE_UNITS_ARB, the error INVALID_OPERATION is generated by any such command.

ActiveTexture generates the error INVALID_ENUM if an invalid <texture> is specified. <texture> is a symbolic constant of the form TEXTUREi, indicating that texture unit i is to be modified.
The constants obey TEXTUREi = TEXTURE0 + i (i is in the range 0 to k-1, where k is the larger of MAX_TEXTURE_COORDS_ARB and MAX_TEXTURE_IMAGE_UNITS_ARB).

For compatibility with old OpenGL specifications, the implementation-dependent constant MAX_TEXTURE_UNITS specifies the number of conventional texture units supported by the implementation. Its value must be no larger than the minimum of MAX_TEXTURE_COORDS_ARB and MAX_TEXTURE_IMAGE_UNITS_ARB.

(modify last paragraph, p. 35) The state required to implement transformations consists of a <n>-value integer indicating the current matrix mode (where <n> is 4 + the number of supported texture and program matrices), a stack of at least two 4x4 matrices for each of COLOR, PROJECTION, and TEXTURE with associated stack pointers, <n> stacks (where <n> is at least 8) of at least one 4x4 matrix for each MATRIX<i>_ARB with associated stack pointers, and a stack of at least 32 4x4 matrices with an associated stack pointer for MODELVIEW. Initially, there is only one matrix on each stack, and all matrices are set to the identity. The initial matrix mode is MODELVIEW. The initial value of ACTIVE_TEXTURE is TEXTURE0.

Additions to Chapter 3 of the OpenGL 1.3 Specification (Rasterization)

Modify Chapter 3, Introduction (p. 58)

(modify first paragraph, p. 58) ... Figure 3.1 diagrams the rasterization process. The color value assigned to a fragment is initially determined by the rasterization operations (sections 3.3 through 3.7) and modified by either the execution of the texturing, color sum, and fog operations as defined in sections 3.8, 3.9, and 3.10, or of a fragment program defined in section 3.11. The final depth value is initially determined by the rasterization operations and may be modified or replaced by a fragment program.
(modify Figure 3.1)

    [Figure 3.1: Rasterization. Point, line, and polygon rasterization (fed from primitive assembly), pixel rectangle rasterization (fed from DrawPixels), and bitmap rasterization (fed from Bitmap) produce fragments. The FRAGMENT_PROGRAM_ARB enable selects whether those fragments pass through Texturing, Color Sum, and Fog, or through the Fragment Program, before emerging as Fragments.]

Modify Section 3.3, Points (p. 63)

(modify first and second paragraphs, p. 64) All fragments produced in rasterizing a non-antialiased point are assigned the same associated data, which are those of the vertex corresponding to the point. (delete reference to divide by q)

If antialiasing is enabled, then ... The data associated with each fragment are otherwise the data associated with the point being rasterized. (delete reference to divide by q)

Modify Section 3.4.1, Basic Line Segment Rasterization (p. 66)

(modify first paragraph, p. 68) ... (Note that t=0 at p_a and t=1 at p_b). The value of an associated datum f from the fragment center, whether it be R, G, B, or A (in RGBA mode) or a color index (in color index mode) or the s, t, r, or q texture coordinate or the clip w coordinate (the depth value, window z, must be found using equation 3.3, below), is found as

         (1-t)*(f_a/w_a) + t*(f_b/w_b)
    f =  -----------------------------          (3.2)
           (1-t)*(1/w_a) + t*(1/w_b)

where f_a and f_b are the data associated with the starting and ending endpoints of the segment, respectively; w_a and w_b are the clip w coordinates of the starting and ending endpoints of the segments, respectively.
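As an illustration of equation 3.2, the following C sketch evaluates the perspective-correct value of a datum along a line segment and, for comparison, the plain linear blend of the same endpoints. The function names interp_perspective and interp_linear are hypothetical and are not part of the GL or of this extension; the sketch assumes both clip w coordinates are nonzero.

```c
#include <assert.h>

/* Hypothetical helper: equation 3.2.
 * t        position along the segment (0 at p_a, 1 at p_b)
 * f_a, f_b datum at the starting and ending endpoints
 * w_a, w_b clip w coordinates at the starting and ending endpoints */
double interp_perspective(double t, double f_a, double w_a,
                          double f_b, double w_b)
{
    double num, den;

    /* Assumption of this sketch: clip w is nonzero at both endpoints. */
    assert(w_a != 0.0 && w_b != 0.0);

    num = (1.0 - t) * (f_a / w_a) + t * (f_b / w_b);
    den = (1.0 - t) * (1.0 / w_a) + t * (1.0 / w_b);
    return num / den;
}

/* Hypothetical helper: plain linear blend of the two endpoint values,
 * which ignores the clip w coordinates. */
double interp_linear(double t, double f_a, double f_b)
{
    return (1.0 - t) * f_a + t * f_b;
}
```

For example, with f_a = 0, w_a = 1, f_b = 1, w_b = 3, and t = 0.5, equation 3.2 yields 0.25 while the linear blend yields 0.5. When w_a equals w_b the two forms agree.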
Note that linear interpolation would use

    f = (1-t)*f_a + t*f_b.          (3.3)

... A GL implementation may choose to approximate equation 3.2 with 3.3, but this will normally lead to unacceptable distortion effects when interpolating texture coordinates or clip w coordinates.

Modify Section 3.5.1, Basic Polygon Rasterization (p. 73)

(modify third and fourth paragraphs, p. 74) Denote a datum at p_a, p_b, or p_c as f_a, f_b, or f_c, respectively. Then the value f of a datum at a fragment produced by rasterizing a triangle is given by

         a*(f_a/w_a) + b*(f_b/w_b) + c*(f_c/w_c)
    f =  ---------------------------------------          (3.4)
            a*(1/w_a) + b*(1/w_b) + c*(1/w_c)

where w_a, w_b, and w_c are the clip w coordinates of p_a, p_b, and p_c, respectively. a, b, and c are the barycentric coordinates of the fragment for which the data are produced. a, b, and c must correspond precisely to the ... at the fragment's center.

Just as with line segment rasterization, equation 3.4 may be approximated by

    f = a*f_a + b*f_b + c*f_c;

this may yield ... for texture coordinates or clip w coordinates.

Modify Section 3.6.4, Rasterization of Pixel Rectangles (p. 91)

(modify third paragraph, p. 103) A fragment arising from a group ... the color and texture coordinates are given by those associated with the current raster position. (delete reference to divide by q) Groups arising from DrawPixels...

Modify Section 3.7, Bitmaps (p. 113)

(modify third paragraph, p. 114) Otherwise, a rectangular array ... The associated data for each fragment are those associated with the current raster position. (delete reference to divide by q) Once the fragments have been produced ...

Modify Section 3.8, Texturing (p. 115)

(add new paragraphs before first paragraph, p. 115) Texture coordinate sets are mapped to RGBA colors for application to primitives in one of two modes.
The first mode, described in this and subsequent sections, is GL's conventional multitexture pipeline, describing texture environment and texture application. The second mode, referred to as fragment program mode and described in section 3.11, applies textures, color sum, and fog as specified in an application-supplied fragment program. The fragment program mode is enabled and disabled using the generic Enable and Disable commands, respectively, with the symbolic constant FRAGMENT_PROGRAM_ARB. The required state is one bit indicating whether the fragment program mode is enabled or disabled. In the initial state, the fragment program mode is disabled. When fragment program mode is enabled, texturing, color sum, and fog application stages are ignored and a general purpose program is executed instead. (modify first and second paragraph, p. 115) Conventional texturing is employed when fragment program mode is disabled. Texturing maps ... color of an image at the location indicated by a fragment's texture coordinates to modify the fragment's primary RGBA color. Texturing does not affect the secondary color. An implementation may support texturing using more than one image at a time. In this case the fragment carries multiple sets of texture coordinates which are used to index ... (add paragraph before 1st paragraph, p. 116) Except when in fragment program mode (section 3.11), the (s,t,r) texture coordinates used for texturing are the values s/q, t/q, and r/q, respectively, where s, t, r, and q are the texture coordinates associated with the fragment. When in fragment program mode, the (s,t,r) texture coordinates are specified by the program. If q is less than or equal to zero, the results of texturing are undefined. Modify Section 3.8.7, Texture Minification (p. 135) (add new paragraph after first paragraph, p. 137) When fragment program mode is enabled, the derivatives of the coordinates may be ill-defined or non-existent. 
As a result, the implementation is free to approximate these derivatives with such techniques as differencing. The only requirement is that texture samples be equivalent across the two modes. In other words, the texture sample chosen for a fragment of a primitive must be invariant between fragment program mode and conventional mode subject to the rules set forth in Appendix A, Invariance. Modify Section 3.8.13, Texture Application (p. 149) (modify fourth paragraph, p. 152) Texturing is enabled and disabled individually for each texture unit. If texturing is disabled for one of the units, then the fragment resulting from the previous unit is passed unaltered to the following unit. Individual texture units beyond those specified by MAX_TEXTURE_UNITS may be incomplete and are always treated as disabled. Insert a new Section 3.11, (p. 154), between existing sections 3.10 and 3.11. Renumber 3.11, Antialiasing Application, to 3.12. 3.11 Fragment Programs The conventional GL texturing model described in section 3.8 is a configurable but essentially hard-wired sequence of per-fragment computations based on a canonical set of per-fragment parameters and texturing-related state such as texture images, texture parameters, and texture environment parameters. The general success and utility of the conventional GL texturing model reflects its basic correspondence to the typical texturing requirements of 3D applications. However when the conventional GL texturing model is not sufficient, the fragment program mode provides a substantially more flexible model for generating fragment colors. The fragment program mode permits applications to define their own fragment programs. A fragment program is a character string that specifies a sequence of operations to perform. Fragment program instructions are typically 4-component vector operations that operate on per-fragment attributes and program parameters. 
Fragment programs execute on a per-fragment basis and operate on each fragment completely independently from any other fragments. Fragment programs execute a finite fixed sequence of instructions with no branching or looping. Fragment programs execute without data hazards so results computed in one instruction can be used immediately afterwards. The result of a fragment program is a set of fragment result registers that becomes the color used by antialiasing application and/or a depth value used in place of the interpolated depth value generated by conventional rasterization.

In fragment program mode, the color sum is subsumed by the fragment program. An application desiring the primary and secondary colors to be summed must explicitly include this operation in its program.

Fragment programs are defined to operate only in RGBA mode. The results of fragment program execution are undefined if the GL is in color index mode.

3.11.1 Program Objects

The GL provides one or more program targets, each identifying a portion of the GL that can be controlled through application-specified programs. The program target for fragment programs is FRAGMENT_PROGRAM_ARB. Each program target has an associated program object, called the current program object. Each program target also has a default program object, which is initially the current program object.

Each program object has an associated program string. The command

    void ProgramStringARB(enum target, enum format, sizei len, const void *string);

updates the program string for the current program object for <target>. <format> describes the format of the program string, which must currently be PROGRAM_FORMAT_ASCII_ARB. <string> is a pointer to the array of bytes representing the program string being loaded, which need not be null-terminated. The length of the array is given by <len>. If <string> is null-terminated, <len> should not include the terminator.
When a program string is loaded, it is interpreted according to syntactic and semantic rules corresponding to the program target specified by <target>. If a program violates the syntactic or semantic restrictions of the program target, ProgramStringARB generates the error INVALID_OPERATION. An implementation may also generate the error INVALID_OPERATION if the program would exceed the native resource limits defined in section 6.1.12. A program which fails to load due to exceeding native resource limits must always fail, regardless of any other GL state.

Additionally, ProgramStringARB will update the program error position (PROGRAM_ERROR_POSITION_ARB) and error string (PROGRAM_ERROR_STRING_ARB). If a program fails to load, the value of the program error position is set to the ubyte offset into the specified program string indicating where the first program error was detected. If the program fails to load because of a semantic restriction that is not detected until the program is fully scanned, the error position is set to the value of <len>. If a program loads successfully, the error position is set to the value negative one. The implementation-dependent program error string contains one or more error or warning messages. If a program loads successfully, the error string may either contain warning messages or be empty.

Each program object has an associated array of program local parameters. The number and type of program local parameters is target- and implementation-dependent. For fragment programs, program local parameters are four-component floating-point vectors. The number of vectors is given by the implementation-dependent constant MAX_PROGRAM_LOCAL_PARAMETERS_ARB, which must be at least 24.
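The three cases of the program error position described above for ProgramStringARB (negative one on success, a byte offset on failure, or the string length when a semantic error is found only after a full scan) can be sketched in C as follows. The sketch makes no GL calls; it assumes the application has already obtained the position, for instance with glGetIntegerv and GL_PROGRAM_ERROR_POSITION_ARB in the C binding. The names load_status and classify_error_position are hypothetical, and counting lines by newline only is a simplification chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a program error position. */
enum load_status {
    LOAD_OK,               /* position is -1: program loaded            */
    LOAD_ERROR_AT_OFFSET,  /* position is a byte offset into the string */
    LOAD_ERROR_AFTER_SCAN  /* position equals the string length         */
};

/* Hypothetical helper: interpret an error position for a program string
 * of len bytes (not necessarily null-terminated). For an error at a byte
 * offset, *line and *col receive the 1-based line and column of that
 * byte; they are left untouched in the other two cases. */
enum load_status classify_error_position(const char *src, int len, int pos,
                                         int *line, int *col)
{
    int i;

    assert(src != NULL && line != NULL && col != NULL);

    if (pos < 0)
        return LOAD_OK;
    if (pos >= len)
        return LOAD_ERROR_AFTER_SCAN;

    *line = 1;
    *col = 1;
    for (i = 0; i < pos; ++i) {
        if (src[i] == '\n') {   /* simplification: newline only */
            ++*line;
            *col = 1;
        } else {
            ++*col;
        }
    }
    return LOAD_ERROR_AT_OFFSET;
}
```

An application would typically pair the reported line and column with the text returned for PROGRAM_ERROR_STRING_ARB when presenting a load failure to the user.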
The commands

    void ProgramLocalParameter4fARB(enum target, uint index, float x, float y, float z, float w);
    void ProgramLocalParameter4fvARB(enum target, uint index, const float *params);
    void ProgramLocalParameter4dARB(enum target, uint index, double x, double y, double z, double w);
    void ProgramLocalParameter4dvARB(enum target, uint index, const double *params);

update the values of the program local parameter numbered <index> belonging to the program object currently bound to <target>. For ProgramLocalParameter4fARB and ProgramLocalParameter4dARB, the four components of the parameter are updated with the values of <x>, <y>, <z>, and <w>, respectively. For ProgramLocalParameter4fvARB and ProgramLocalParameter4dvARB, the four components of the parameter are updated with the array of four values pointed to by <params>. The error INVALID_VALUE is generated if <index> is greater than or equal to the number of program local parameters supported by <target>.

Additionally, each program target has an associated array of program environment parameters. Unlike program local parameters, program environment parameters are shared by all program objects of a given target. The number and type of program environment parameters is target- and implementation-dependent. For fragment programs, program environment parameters are four-component floating-point vectors. The number of vectors is given by the implementation-dependent constant MAX_PROGRAM_ENV_PARAMETERS_ARB, which must be at least 24.

The commands

    void ProgramEnvParameter4fARB(enum target, uint index, float x, float y, float z, float w);
    void ProgramEnvParameter4fvARB(enum target, uint index, const float *params);
    void ProgramEnvParameter4dARB(enum target, uint index, double x, double y, double z, double w);
    void ProgramEnvParameter4dvARB(enum target, uint index, const double *params);

update the values of the program environment parameter numbered <index> for the given program target <target>.
For ProgramEnvParameter4fARB and ProgramEnvParameter4dARB, the four components of the parameter are updated with the values of <x>, <y>, <z>, and <w>, respectively. For ProgramEnvParameter4fvARB and ProgramEnvParameter4dvARB, the four components of the parameter are updated with the array of four values pointed to by <params>. The error INVALID_VALUE is generated if <index> is greater than or equal to the number of program environment parameters supported by <target>.

Each program target has a default program object. Additionally, named program objects can be created and operated upon. The name space for program objects is the positive integers and is shared by programs of all targets. The name zero is reserved by the GL.

A named program object is created by binding an unused program object name to a valid program target. The binding is effected by calling

    void BindProgramARB(enum target, uint program);

with <target> set to the desired program target and <program> set to the unused program name. The resulting program object has a program target given by <target> and is assigned target-specific default values (see section 3.11.8 for fragment programs). BindProgramARB may also be used to bind an existing program object to a program target. If <program> is zero, the default program object for <target> is bound. If <program> is the name of an existing program object whose associated program target is <target>, the named program object is bound. The error INVALID_OPERATION is generated if <program> names an existing program object whose associated program target is anything other than <target>.

Program objects are deleted by calling

    void DeleteProgramsARB(sizei n, const uint *programs);

<programs> contains <n> names of programs to be deleted. After a program object is deleted, its name is again unused. If a program object that is bound to any target is deleted, it is as though BindProgramARB is first executed with the same target and a <program> of zero. Unused names in <programs> are silently ignored, as is the value zero.

The command

    void GenProgramsARB(sizei n, uint *programs);

returns <n> currently unused program names in <programs>.
These names are marked as used, for the purposes of GenProgramsARB only, but objects are created only when they are first bound using BindProgramARB. 3.11.2 Fragment Program Grammar and Semantic Restrictions Fragment program strings are specified as an array of ASCII characters containing the program text. When a fragment program is loaded by a call to ProgramStringARB, the program string is parsed into a set of tokens possibly separated by whitespace. Spaces, tabs, newlines, carriage returns, and comments are considered whitespace. Comments begin with the character "#" and are terminated by a newline, a carriage return, or the end of the program array. The Backus-Naur Form (BNF) grammar below specifies the syntactically valid sequences for fragment programs. The set of valid tokens can be inferred from the grammar. The token "" represents an empty string and is used to indicate optional rules. A program is invalid if it contains any undefined tokens or characters. A fragment program is required to begin with the header string "!!ARBfp1.0", without any preceding whitespace. This string identifies the subsequent program text as a fragment program (version 1.0) that should be parsed according to the following grammar and semantic rules. Program string parsing begins with the character immediately following the header string. ::= "END" ::=