Home of the original IBM PC emulator for browsers.
The following document is from the Microsoft Programmer’s Library 1.3 CD-ROM.
Microsoft C - Advanced Programming Techniques
────────────────────────────────────────────────────────────────────────────
Microsoft (R) C - Advanced Programming Techniques
FOR MS (R) OS/2 AND MS-DOS (R)
OPERATING SYSTEMS
────────────────────────────────────────────────────────────────────────────
MICROSOFT CORPORATION
Information in this document is subject to change without notice and does
not represent a commitment on the part of Microsoft Corporation. The
software described in this document is furnished under a license agreement
or nondisclosure agreement. The software may be used or copied only in
accordance with the terms of the agreement. It is against the law to copy
the software on any medium except as specifically allowed in the license or
nondisclosure agreement. No part of this manual may be reproduced or trans-
mitted in any form or by any means, electronic or mechanical, including
photocopying and recording, for any purpose without the express written
permission of Microsoft.
(C) Copyright Microsoft Corporation, 1990. All rights reserved.
Printed and bound in the United States of America.
Microsoft, MS, MS-DOS, CodeView, InPort, and XENIX are
registered trademarks and Windows is a trademark of Microsoft Corporation.
Apple and Macintosh are registered trademarks and Finder
is a trademark of Apple Computer, Inc.
AT&T is a registered trademark of American Telephone
and Telegraph Company.
Hercules is a registered trademark and InColor is a trademark
of Hercules Computer Technology.
IBM is a registered trademark of International Business
Machines Corporation.
Intel is a registered trademark of Intel Corporation.
Olivetti is a registered trademark of Ing. C. Olivetti.
PDP-11 and VAX-11 are registered trademarks of Digital
Equipment Corporation.
WANG is a registered trademark of Wang Laboratories.
Z8000 is a registered trademark of Zilog, Inc.
Document No. LN06514-1189 OEMO711-6Z
10 9 8 7 6 5 4 3 2 1
Table of Contents
────────────────────────────────────────────────────────────────────────────
Introduction
Scope of This Book
Document Conventions
PART I Improving Program Performance
────────────────────────────────────────────────────────────────────────────
Chapter 1 Optimizing C Programs
1.1 Controlling Optimization from the Programmer's WorkBench
1.2 Controlling Optimization from the Command Line
1.3 Controlling Optimization with Pragmas
1.4 Default Optimization
1.4.1 Common Subexpression Elimination
1.4.2 Dead-Store Elimination
1.4.3 Constant Propagation
1.5 Customizing Your Optimizations
1.5.1 Choosing Speed or Size (/Ot and /Os)
1.5.2 Generating Intrinsic Functions (/Oi)
1.5.3 Assuming No Aliasing (/Oa and /Ow)
1.5.4 Performing Loop Optimizations (/Ol)
1.5.5 Disabling Unsafe Loop Optimizations (/On)
1.5.6 Enabling Aggressive Optimizations (/Oz)
1.5.7 Removing Stack Probes (/Gs)
1.5.8 Enabling Global Register Allocation (/Oe)
1.5.9 Enabling Common Subexpression Optimization (/Oc and
/Og)
1.5.10 Achieving Consistent Floating-Point Results (/Op)
1.5.11 Using the 80186, 80188, or 80286 Processor (/G0, /G1,
/G2)
1.5.12 Optimizing for Maximum Efficiency (/Ox)
1.6 Linker (LINK) Options that Control Optimization
1.6.1 Enabling Far Call Optimization (/FARCALLTRANSLATION)
1.6.2 Packing Code (/PACKCODE)
1.6.3 Packing Data (/PACKDATA)
1.6.4 Packing the Executable File (/EXEPACK)
1.7 Optimizing in Different Environments
1.7.1 Optimizing in DOS
1.7.2 Optimizing in OS/2
1.7.3 Optimizing in Microsoft Windows(tm)
1.8 Choosing Function-Calling Conventions
1.8.1 The C Calling Convention (/Gd)
1.8.2 The FORTRAN/Pascal Calling Convention (/Gc)
1.8.3 The Register Calling Convention (/Gr)
1.8.4 The _fastcall Calling Convention
Chapter 2 Managing Memory
2.1 Pointer Sizes
2.1.1 Pointers and 64K Segments
2.1.2 Near Pointers
2.1.3 Far Pointers
2.1.4 Huge Pointers
2.1.5 Based Addressing
2.2 Selecting a Standard Memory Model
2.2.1 The Six Standard Memory Models
2.2.2 Limitations on Code Size and Data Size
2.2.3 The Tiny Memory Model
2.2.4 The Huge Memory Model
2.2.5 Null Pointers
2.2.6 Specifying a Memory Model
2.3 Mixing Memory Models
2.3.1 Pointer Problems
2.3.2 Declaring Near, Far, Huge, and Based Variables
2.3.3 Declaring Near and Far Functions
2.3.4 Pointer Conversions
2.4 Customizing Memory Models
2.4.1 Setting a Size for Code Pointers
2.4.2 Setting a Size for Data Pointers
2.4.3 Setting Up Segments
2.4.4 Library Support for Customized Memory Models
2.4.5 Setting the Data Threshold
2.4.6 Naming Modules and Segments
Chapter 3 Using the In-Line Assembler
3.1 Advantages of In-Line Assembly
3.2 The _asm Keyword
3.3 Using Assembly Language in _asm Blocks
3.4 Using C in _asm Blocks
3.4.1 Using Operators
3.4.2 Using C Symbols
3.4.3 Accessing C Data
3.4.4 Writing Functions
3.5 Using and Preserving Registers
3.6 Jumping to Labels
3.7 Calling C Functions
3.8 Defining _asm Blocks as C Macros
3.9 Optimizing
Chapter 4 Controlling Floating-Point Math Operations
4.1 Declaring Floating-Point Types
4.1.1 Declaring Variables as Floating-Point Types
4.1.2 Declaring Functions that Return Floating-Point Types
4.2 C Run-Time Library Support of Type long double
4.3 Summary of Math Packages
4.3.1 Emulator Package
4.3.2 Math Coprocessor Package
4.3.3 Alternate Math Package
4.4 Selecting Floating-Point Options (/FP)
4.4.1 In-Line Emulator Option (/FPi)
4.4.2 In-Line Math Coprocessor Instructions Option (/FPi87)
4.4.3 Calls to Emulator Option (/FPc)
4.4.4 Calls to Math Coprocessor Option (/FPc87)
4.4.5 Use Alternate Math Option (/FPa)
4.5 Library Considerations for Floating-Point Options
4.5.1 Using One Standard Library for Linking
4.5.2 In-Line Instructions or Calls
4.6 Compatibility between Floating-Point Options
4.7 Using the NO87 Environment Variable
4.8 Incompatibility Issues
PART II Improving Programmer Productivity
────────────────────────────────────────────────────────────────────────────
Chapter 5 Compiling and Linking Quickly
5.1 Compiling Quickly
5.1.1 Quick Compiler
5.1.2 Incremental Compile Option
5.2 Linking Quickly with ILINK
5.2.1 Preparing for Incremental Linking
5.2.2 Incremental Violations
Chapter 6 Managing Development Projects with NMAKE
6.1 Overview of NMAKE
6.2 The NMAKE Command
6.3 NMAKE Description Files
6.3.1 Description Blocks
6.3.2 Comments
6.3.3 Macros
6.3.4 Inference Rules
6.3.5 Directives
6.3.6 Pseudotargets
6.3.7 PWB's extmake Syntax
6.4 Command-Line Options
6.5 NMAKE Command Files
6.6 The TOOLS.INI File
6.7 In-Line Files
6.8 NMAKE Operations Sequence
6.9 Differences between NMAKE and MAKE
Chapter 7 Creating Help Files with HELPMAKE
7.1 Structure and Contents of a Help Database
7.1.1 Contents of a Help File
7.1.2 Help File Formats
7.2 Invoking HELPMAKE
7.3 HELPMAKE Options
7.3.1 Options for Encoding
7.3.2 Options for Decoding
7.4 Creating a Help Database
7.5 Help Text Conventions
7.5.1 Structure of the Help Text File
7.5.2 Local Contexts
7.5.3 Context Prefixes
7.5.4 Hyperlinks
7.6 Using Help Database Formats
7.6.1 QuickHelp Format
7.6.2 Minimally Formatted ASCII Format
7.6.3 Rich Text Format (RTF)
Chapter 8 Customizing the Microsoft Programmer's WorkBench
8.1 Setting Switches
8.1.1 Editing the <assign> Pseudofile
8.1.2 Editing the TOOLS.INI Initialization File
8.2 Assigning Keystrokes
8.3 Writing Macros
8.3.1 Macro Syntax
8.3.2 Macro Responses
8.3.3 Macro Arguments
8.3.4 Macro Conditionals
8.3.5 Temporary Macros
8.3.6 Macro Recordings
8.4 Writing and Building C Extensions
8.4.1 Building Real-Mode Extensions
8.4.2 Building Protected-Mode Extensions
8.4.3 Describing Functions and Switches
8.4.4 Initializing Functions
8.4.5 Prototyping Functions
8.4.6 Receiving Parameters
8.4.7 Calling PWB Functions
8.4.8 Calling C Library Functions
Chapter 9 Debugging C Programs with CodeView
9.1 Understanding CodeView Windows
9.2 Overview of Debugging Techniques
9.3 Viewing and Modifying Program Data
9.3.1 Displaying Variables in the Watch Window
9.3.2 Displaying Expressions in the Watch Window
9.3.3 Displaying Arrays and Structures
9.3.4 Displaying Array Elements Dynamically
9.3.5 Using Quick Watch
9.3.6 Displaying Memory
9.3.7 Displaying the Processor Registers
9.3.8 Modifying the Values of Variables, Registers,
and Memory
9.4 Controlling Execution
9.4.1 Continuous Execution
9.4.2 Single-Stepping
9.5 Replaying a Debug Session
9.6 Advanced CodeView Techniques
9.7 Controlling CodeView with Command-Line Options
9.8 Customizing CodeView with the TOOLS.INI FILE
PART III Special Environments
────────────────────────────────────────────────────────────────────────────
Chapter 10 Communicating with Graphics
10.1 Video Modes
10.1.1 Sample Low-Level Graphics Program
10.1.2 Setting a Video Mode
10.1.3 Reading the videoconfig Structure
10.1.4 Maximizing Resolution or Color
10.1.5 Selecting Your Own Video Modes
10.2 Mixing Colors and Changing Palettes
10.2.1 CGA Palettes
10.2.2 Olivetti(R) Palettes
10.2.3 VGA Palettes
10.2.4 MCGA Palettes
10.2.5 EGA Palettes
10.2.6 Symbolic Constants
10.3 Specifying Points within Coordinate Systems
10.3.1 Physical Coordinates
10.3.2 Viewport Coordinates
10.3.3 Window Coordinates
10.3.4 Screen Locations
10.3.5 Bounding Rectangles
10.3.6 The Pixel Cursor
10.4 Graphics Functions
10.4.1 Controlling Video Modes
10.4.2 Changing Colors
10.4.3 Drawing Points, Lines, and Shapes
10.4.4 Defining Patterns
10.4.5 Manipulating Images
10.5 Using Graphic Fonts
10.5.1 Using the C Font Library
10.5.2 Registering the Fonts
10.5.3 Setting the Current Font
10.5.4 Displaying Text
10.5.5 A Sample Program
10.5.6 Using Fonts Effectively
Chapter 11 Creating Charts and Graphs
11.1 Overview of Presentation Graphics
11.2 Parts of a Graph
11.3 Writing a Presentation Graphics Program
11.3.1 Pie Chart
11.3.2 Bar, Column, and Line Charts
11.3.3 Scatter Diagram
11.4 Manipulating Colors and Patterns
11.4.1 Color Pool
11.4.2 Style Pool
11.4.3 Pattern Pool
11.4.4 Character Pool
11.5 Customizing the Chart Environment
11.5.1 titletype Structures
11.5.2 axistype Structures
11.5.3 windowtype Structures
11.5.4 legendtype Structures
11.5.5 chartenv Structures
Chapter 12 Programming with Mixed Languages
12.1 Making Mixed-Language Calls
12.2 Language Convention Requirements
12.2.1 Naming Convention Requirement
12.2.2 Calling Convention Requirement
12.2.3 Parameter-Passing Requirement
12.3 Compiling and Linking
12.3.1 Compiling with Correct Memory Models
12.3.2 Linking with Language Libraries
12.4 C Calls to High-Level Languages
12.5 C Calls to BASIC
12.6 C Calls to FORTRAN
12.6.1 Calling a FORTRAN Subroutine from C
12.6.2 Calling a FORTRAN Function from C
12.7 C Calls to Pascal
12.7.1 Calling a Pascal Procedure from C
12.7.2 Calling a Pascal Function from C
12.8 C Calls to Assembly Language
12.8.1 Writing the Assembly-Language Procedure
12.8.2 Setting Up the Procedure
12.8.3 Entering the Procedure
12.8.4 Allocating Local Data
12.8.5 Preserving Register Values
12.8.6 Accessing Parameters
12.8.7 Returning a Value
12.8.8 Exiting the Procedure
12.9 Handling Data in Mixed-Language Programming
12.9.1 Default Naming and Calling Conventions
12.9.2 Numeric Data Representation
12.9.3 Strings
12.9.4 Arrays
12.9.5 Array Declaration and Indexing
12.9.6 Structures, Records, and User-Defined Types
12.9.7 External Data
12.9.8 Pointers and Address Variables
12.9.9 Common Blocks
12.9.10 Using a Varying Number of Parameters
Chapter 13 Writing Portable Programs
13.1 Assumptions about Hardware
13.1.1 Size of Basic Types
13.1.2 Storage Order and Alignment
13.1.3 Byte Order in a Word
13.1.4 Reading and Writing Structures
13.1.5 Bit Fields in Structures
13.1.6 Processor Arithmetic Mode
13.1.7 Pointers
13.1.8 Address Space
13.1.9 Character Set
13.2 Assumptions about the Compiler
13.2.1 Sign Extension
13.2.2 Length and Case of Identifiers
13.2.3 Register Variables
13.2.4 Functions with a Variable Number of Arguments
13.2.5 Evaluation Order
13.2.6 Function and Macro Arguments with Side Effects
13.2.7 Environment Differences
13.3 Portability of Data Files
13.4 Portability Concerns Specific to Microsoft C
13.5 Microsoft C Byte Ordering
PART IV OS/2 Support
────────────────────────────────────────────────────────────────────────────
Chapter 14 Building OS/2 Applications
14.1 The OS/2 Applications Program Interface
14.1.1 Calling the OS/2 API
14.1.2 Including the OS/2 Header Files
14.1.3 Creating Dual-Mode Programs as Family Applications
14.2 Compile Options for the CL Command
14.2.1 The Link Mode Options (/Lp, /Lr, and /Lc)
14.2.2 Creating Bound Programs Option (/Fb)
14.2.3 Library Selection Options (/MT, /ML, /MD, /Zl)
14.2.4 Memory-Model Options (/Ax)
14.3 Module-Definition Files and Import Libraries
14.3.1 Adding a Module-Definition File to the LINK Command
14.3.2 Creating Dynamic-Link Libraries (DLLs)
14.3.3 Creating Programs with I/O Privileges
14.3.4 Creating Presentation Manager Applications
14.3.5 Creating Import Libraries with the IMPLIB Utility
14.4 Link Command-Line Options
14.5 The BIND Utility
Chapter 15 Creating Multithread OS/2 Applications
15.1 Multithread Programs
15.1.1 Library Support
15.1.2 Include Files
15.1.3 C Run-Time Library Functions for Thread Control
15.2 Sample Multithread C Program
15.3 Writing a Multithread Program
15.4 Compiling and Linking
15.5 Avoiding Problem Areas
15.6 Using the Protected-Mode CodeView Debugger
15.6.1 Compiling with the /Zi Option
15.6.2 Prompt for Thread Number
15.6.3 Thread Commands
15.6.4 Screen Groups Used by CodeView
Chapter 16 Dynamic Linking with OS/2
16.1 Overview of Dynamic Linking
16.1.1 Load-Time and Run-Time Linking
16.1.2 Application Programs and DLLs
16.1.3 DLLs and Microsoft C Run-Time Libraries
16.2 Designing and Writing DLLs
16.2.1 Floating-Point Math Requirements
16.2.2 Initialization and Termination Requirements
16.2.3 Making the DLL Re-Entrant
16.2.4 Signal Handling
16.2.5 Using Microsoft C Keywords
16.2.6 Compile Options for Dynamic-Link Libraries
16.3 Building DLLs with Microsoft C
16.3.1 DLLs with Static C Run-Time Library Functions
16.3.2 DLLs without C Run-Time Library Functions
16.3.3 Programs and DLLs with a C Run-Time DLL
16.3.4 Using CodeView to Debug Dynamic-Link Libraries
Appendix A Using Exit Codes
A.1 The exit Function
A.2 Testing Exit Codes from Command and Batch Files
A.3 Accessing Exit Codes from Other Programs
Appendix B Differences between C Versions 5.1 and 6.0
B.1 Modifications for ANSI Compatibility
B.1.1 ANSI-Mandated New Features
B.1.2 Integer Promotion Rules
B.1.3 Defining NULL as a Pointer
B.1.4 Shift Operators
B.1.5 Pointers to Typedefs
B.1.6 Identifying Nonstandard Keywords
B.1.7 Trigraphs
B.1.8 ANSI Nonconformance
B.2 New Keywords and Functions
B.2.1 In-Line Assembler
B.2.2 Based Pointers and Objects
B.2.3 Based Heap Allocation Support
B.2.4 Releasing Unused Heap Memory
B.2.5 Making Static Data Available to the Heap
B.2.6 Long Doubles
B.2.7 Long Double Functions
B.2.8 Model-Independent String and Memory Functions
B.2.9 Mixed-Model Memory Allocation Support
B.2.10 The _fastcall Attribute (/Gr Option)
B.2.11 Drive and Directory Functions
B.2.12 Text Output Functions for OS/2
B.3 New Features
B.3.1 Strings and Macros
B.3.2 CL Options
B.3.3 Tiny Memory Model (.COM Files)
B.3.4 The Optimize Pragma
B.3.5 Nameless Structures and Unions
B.3.6 Unsized Arrays as the Last Member of a Structure
B.3.7 Improved Warnings
B.3.8 Macros
B.3.9 Improved Multithread Support in OS/2
B.3.10 Pipe Support in OS/2
B.4 Differences in Code Generation
B.4.1 Speed and Space Improvements
B.4.2 Code Quality
B.4.3 Floating-Point Code Generation
B.4.4 Intrinsic Functions
B.5 Changes and Deletions
B.5.1 Deleted Features
B.5.2 Evaluation of Real Expressions
B.5.3 Default Optimizations
B.5.4 Sign Extension of char Arguments
B.5.5 Conditional Compilation and Signed Values
B.5.6 The const and volatile Qualifiers
B.5.7 Memory Allocation
B.5.8 Memory Used by Command-Line Arguments
B.5.9 Format Specifiers in printf
B.5.10 Functions that Return Float Values
Appendix C Implementation-Defined Behavior
C.1 Translation
C.1.1 Diagnostics
C.2 Environment
C.2.1 Arguments to main
C.2.2 Interactive Devices
C.3 Identifiers
C.3.1 Significant Characters without External Linkage
C.3.2 Significant Characters with External Linkage
C.3.3 Upper- and Lowercase
C.4 Characters
C.4.1 The ASCII Character Set
C.4.2 Multibyte Characters
C.4.3 Bits per Character
C.4.4 Character Sets
C.4.5 Unrepresented Character Constants
C.4.6 Wide Characters
C.4.7 Converting Multibyte Characters
C.4.8 Range of char Values
C.5 Integers
C.5.1 Range of Integer Values
C.5.2 Demotion of Integers
C.5.3 Signed Bitwise Operations
C.5.4 Remainders
C.5.5 Right Shifts
C.6 Floating-Point Math
C.6.1 Values
C.6.2 Casting Integers to Floating-Point Values
C.6.3 Truncation of Floating-Point Values
C.7 Arrays and Pointers
C.7.1 Largest Array Size
C.7.2 Casting Pointers
C.7.3 Pointer Subtraction
C.8 Registers
C.8.1 Availability of Registers
C.9 Structures, Unions, Enumerations, and Bit Fields
C.9.1 Improper Access to a Union
C.9.2 Sign of Bit Fields
C.9.3 Storage of Bit Fields
C.9.4 Alignment of Bit Fields
C.9.5 The enum Type
C.10 Qualifiers
C.10.1 Access to Volatile Objects
C.11 Declarators
C.11.1 Maximum Number
C.12 Statements
C.12.1 Limits on Switch Statements
C.13 Preprocessing Directives
C.13.1 Character Constants and Conditional Inclusion
C.13.2 Including Bracketed File Names
C.13.3 Including Quoted File Names
C.13.4 Character Sequences
C.13.5 Pragmas
C.13.6 Default Date and Time
C.14 Library Functions
C.14.1 NULL Macro
C.14.2 Diagnostic Printed by the assert Function
C.14.3 Character Testing
C.14.4 Domain Errors
C.14.5 Underflow of Floating-Point Values
C.14.6 The fmod Function
C.14.7 The signal Function
C.14.8 Default Signals
C.14.9 The SIGILL Signal
C.14.10 Terminating Newline Characters
C.14.11 Blank Lines
C.14.12 Null Characters
C.14.13 File Position in Append Mode
C.14.14 Truncation of Text Files
C.14.15 File Buffering
C.14.16 Zero-Length Files
C.14.17 File Names
C.14.18 File Access Limits
C.14.19 Deleting Open Files
C.14.20 Renaming with a Name that Exists
C.14.21 Printing Pointer Values
C.14.22 Reading Pointer Values
C.14.23 Reading Ranges
C.14.24 File Position Errors
C.14.25 Messages Generated by the perror Function
C.14.26 Allocating Zero Memory
C.14.27 The abort Function
C.14.28 The atexit Function
C.14.29 Environment Names
C.14.30 The system Function
C.14.31 The strerror Function
C.14.32 The Time Zone
C.14.33 The clock Function
Index
Introduction
────────────────────────────────────────────────────────────────────────────
Advanced Programming Techniques describes how to get the most out of the
Microsoft(R) C Professional Development System with its new integrated
development environment─the Microsoft Programmer's WorkBench─and
source-level debugging tool─the CodeView(R) debugger.
In this manual, you will see how all the components of the Microsoft C
Professional Development System work together to provide you with the most
powerful development environment available. A key element in the power of
the Professional Development System is your ability to customize it to suit
your individual needs as a programmer.
Because this book is arranged by topic, it answers questions about using
Microsoft C version 6.0, rather than providing lists of options. If you have
specific questions about menu items in the CodeView debugger, the
Programmer's WorkBench, or any of the command-line utilities included in the
Professional Development System, you can use the Microsoft C Advisor
(on-line help) or the C Reference manual.
Advanced Programming Techniques shows you how tools and utilities all fit
together.
Scope of This Book
Advanced Programming Techniques is divided into four parts. Part 1,
"Improving Program Performance," helps you write more efficient programs. It
provides specific information about optimizing─when and why to use various
optimizing options. It also explains new memory management options and when
to use them. For example, Chapter 3 describes the in-line assembler, a new
feature that lets you mix assembly language with your C source code.
Part 2, "Improving Programmer Productivity," will help you perform
programming tasks more quickly and efficiently. Chapter 8 explains the
different ways you can customize the new Programmer's WorkBench (PWB)─an
editor and integrated development environment that allows you to
■ Create new programs
■ Modify existing programs
■ Browse source files
■ Obtain help about PWB, the C language, and the C run-time libraries
■ Set program build lists
■ Build programs
■ Debug programs with the CodeView debugger
Chapter 8 also describes how to change PWB behavior to suit your programming
style by making keyboard assignments, recording or writing macros, and
writing C extensions.
Also in Part 2 is a chapter about the Microsoft Program Maintenance Utility,
NMAKE. NMAKE is a new program maintenance facility that allows you to use
program lists as input, which provides extra flexibility in your program
build process. It is a superset of the Microsoft XENIX(R) MAKE utility and
is substantially more powerful than previous versions of MAKE.
Chapter 9 in Part 2 describes the CodeView debugger, which is even more
powerful than in previous releases. With CodeView version 3.0, you get many
new features, including the ability to record a debugging session, then play
it back (history and dynamic replay).
Part 3, "Special Environments," describes new graphics capabilities. It also
shows how to program in mixed languages and offers tips to make your
programs more portable. Microsoft C helps you create graphics applications
easily. The Microsoft C run-time libraries contain graphics functions for
low-level graphics operations, such as drawing lines, rectangles, and
circles. The libraries also contain functions for creating presentation
graphics, such as pie charts and bar charts.
Part 4, "OS/2 Support," describes how the Professional Development System
helps you build OS/2 applications. The three chapters in Part 4 provide
information about dual-mode applications, creating multithread applications,
and creating dynamic-link libraries.
A postage-paid documentation feedback card is at the end of this manual.
After you have had a chance to become familiar with Microsoft C 6.0 and its
documentation, please give us your opinion. Your ideas will help us as we
develop future documentation. Also at the end of this book is a Product
Assistance Request form. If you need to call Microsoft for assistance, use
this form first to compile and organize pertinent information.
Document Conventions
────────────────────────────────────────────────────────────────────────────
NOTE
The pages that follow use the term "OS/2" to refer to the OS/2
systems─Microsoft Operating System/2 (MS(R) OS/2) and IBM(R) OS/2.
Similarly, the term "DOS" refers to both the MS-DOS(R) and IBM Personal
Computer DOS operating systems. The name of a specific operating system is
used when it is necessary to note features that are unique to the system.
────────────────────────────────────────────────────────────────────────────
Example Description
────────────────────────────────────────────────────────────────────────────
STDIO.H Uppercase letters indicate file names,
segment names, registers, and terms used
at the DOS- or OS/2-command level.
_cdecl Boldface letters indicate C keywords,
operators, language-specific characters,
and library functions, as well as OS/2
functions. Within discussions of syntax,
bold type indicates that the text must
be entered exactly as shown.
expression Words in italics indicate placeholders
for information you must supply, such as
a file name. Italics are also
occasionally used for emphasis in the
text.
«option» Items inside double square brackets are
optional.
#pragma pack {1|2} Braces and a vertical bar indicate a
choice among two or more items. You must
choose one of these items unless double
square brackets surround the braces.
CL A.C B.C C.OBJ This font is used for examples, user
input, program output, and error
messages in text.
CL options « files...» A horizontal ellipsis following an item
indicates that more items having the
same form may follow.
while( ) A vertical ellipsis tells you that part
{ of the example program has been
. intentionally omitted.
.
.
}
CTRL+ENTER Small capital letters are used for the
names of keys on the keyboard. When you
see a plus sign (+) between two key
names, you should hold down the first
key while pressing the second.
The carriage-return key (sometimes
appearing as a bent arrow on the
keyboard) is called ENTER.
The cursor-movement keys (sometimes
called direction keys) are called the
ARROW keys. Individual keys are referred
to by their direction (LEFT, UP) or by
the name on the key (PGUP).
"argument" Quotation marks enclose a new term the
first time it is defined in text.
Enhanced Graphics Adapter (EGA) The first time an acronym is used, it is
often spelled out.
PART I Improving Program Performance
────────────────────────────────────────────────────────────────────────────
The Microsoft C Professional Development System helps you create the
fastest, smallest applications using its sophisticated optimizer and
enhanced memory management capabilities.
Chapter 1 tells when to use certain optimizations and describes how
Microsoft C generates code that is efficient in execution speed and size.
Chapter 2 explains the sophisticated tools Microsoft C gives you to allocate
and manage program memory, including the new _based type. For cases where
your program requires localized optimization, you can use the in-line
assembler, described in Chapter 3, to introduce the tightest possible code.
If your application requires floating-point math computations, you will find
Chapter 4 helpful in explaining the options in the Microsoft C math
packages; it explains which floating-point options yield the fastest,
smallest, and most flexible code.
Chapter 1 Optimizing C Programs
────────────────────────────────────────────────────────────────────────────
The Microsoft C compiler translates C source statements into
machineexecutable instructions. In addition, the compiler rewrites or
"optimizes" parts of your program to make it more efficient in ways that
are not apparent at the source level.
The compiler performs three general types of optimization:
1. It modifies or moves sections of code so that fewer instructions are
used, or so that the instructions used make more efficient use of the
processor.
2. It moves code and combines operations to maximize use of registers
because operations on data stored in processor registers are far
faster than the same operations on data stored in memory.
3. It eliminates sections of code that are redundant or unused.
This chapter explains the various ways you can control how the Microsoft C
compiler optimizes your code.
1.1 Controlling Optimization from the Programmer's WorkBench
The Programmer's WorkBench (PWB) is an integrated development environment
for editing, building, and debugging applications written in Microsoft C.
For more information on the PWB, see Installing and Using the Microsoft C
Professional Development System.
There are two ways to compile from inside the Programmer's WorkBench:
1. Debug compile. In a default debug compile, the compiler performs no
optimizations at all.
2. Release compile. In a default release compile, the compiler performs
most optimizations.
By modifying the settings in C Global Build Options, C Debug Build Options,
and C Release Build Options (on the Options menu), you can fine-tune
optimization by individually enabling or disabling any of the optimizations
the compiler performs.
The optimizations in each of the Build Options dialog boxes correspond to a
command-line option to CL. (In fact, the PWB constructs a command line from
your input and passes it to CL.)
────────────────────────────────────────────────────────────────────────────
NOTE
In this chapter, optimization options are discussed in terms of the effect
of the optimization, the command-line option to invoke the optimization, and
pragmas that control the optimization. All of these optimizations can be
controlled at the compilation-unit (file) level using the Build Options
dialog boxes.
────────────────────────────────────────────────────────────────────────────
1.2 Controlling Optimization from the Command Line
Controlling optimization from the command line requires that you determine
which optimizations you need for your application. You then specify those
optimizations using command-line options that begin with /O (and in some
cases /G).
If there is any conflict between options, the compiler uses the last option
specified on the command line. The command line
CL /Oa /Ol /Ot TEST.C
compiles the program TEST.C. It specifies that the compiler can
■ Optimize on the assumption that you are doing no aliasing (/Oa)
■ Perform loop optimization (/Ol)
■ Perform other general speed-enhancing optimizations (/Ot)
The preceding command line can also be written
CL /Oalt TEST.C
1.3 Controlling Optimization with Pragmas
Occasionally you will need to exercise a fine level of control over compiler
optimizations. Command-line options allow you to control optimization over
an entire compilation unit (file). In addition, Microsoft C supports several
pragmas that allow you to exercise such control on a per-function basis.
The pragmas that control optimization are described in this chapter under
the type of optimization they affect.
The optimize pragma is new to version 6.0.
In version 6.0, you can control each of the following optimization
parameters on a function-by-function basis using the optimize pragma:
■ Behavior of code with respect to aliasing (a and w)
■ Reduction of local common subexpressions (c)
■ Reduction of global common subexpressions (g)
■ Global register allocation (e)
■ Loop optimization (l)
■ Aggressiveness of optimizations (z)
■ Disabling of unsafe optimizations (n)
■ Achieving consistent floating-point results (p)
■ Optimizing for smaller code size or for faster execution speed (t)
Any optimization or combination of options can be enabled or disabled using
the optimize pragma. For example, if you have one function that uses aliases
heavily, you need to inhibit optimizations that could cause problems with
aliases. You do not, however, want to inhibit these optimizations for code
that does not do aliasing. To do this, use the optimize pragma as follows:
/* Function(s) that do not do aliasing. */
.
.
.
#pragma optimize( "a", off )
/* Function(s) that do aliasing. */
.
.
.
#pragma optimize( "a", on )
/* More function(s) that do not do aliasing. */
The parameters to the optimize pragma can be combined in a string to enable
or disable multiple options at once. For example,
#pragma optimize( "lge", off )
disables loop optimization, global common subexpression optimization, and
global register allocation.
1.4 Default Optimization
Many optimizations are not explicitly disabled by any command-line option
except /Od (disable optimizations). These optimizations are small in scope
and are almost always helpful. They include
■ Short range common subexpression elimination
■ Dead-store elimination
■ Constant propagation
1.4.1 Common Subexpression Elimination
In common subexpression elimination, the compiler finds code containing
repeated subexpressions and produces modified code in which the
subexpressions are evaluated only once. Subexpression elimination is usually
done with temporary variables as shown in the following example:
a = b + c * d;
x = c * d / y;
The preceding two lines contain the common subexpression c * d. This code
can be modified to evaluate c * d only once; the result is placed in a
temporary variable (usually a register):
tmp = c * d;
a = b + tmp;
x = tmp / y;
1.4.2 Dead-Store Elimination
Dead-store elimination is an extension of common subexpression elimination.
Variables that contain the same value in a short piece of code can be
combined into a single temporary variable.
In the following code fragment, the compiler detects that the expression
func( x ) is equivalent to func( a + b ):
x = a + b;
x = func( x );
Thus, the compiler can rewrite the code as follows:
x = func( a + b);
1.4.3 Constant Propagation
When doing constant propagation, the compiler analyzes variable assignments
and determines if they can be changed to constant assignments. In the
following example, the variable i must have a value of 7 when it is
assigned to j:
i = 7;
j = i;
Instead of assigning i to j, the constant 7 can be assigned to j:
i = 7;
j = 7;
While you could make any of these changes in the source file, doing so might
reduce the readability of the program. In many cases, optimizations not only
increase the efficiency of the program but allow you to write more readable
code without any actual efficiency loss.
Remove optimization before using a symbolic debugger.
In some cases, you might want to disable even the default optimizations.
Because optimizations may rearrange code in the object file, it can become
difficult to recognize parts of your code during debugging. It is usually
best to remove all optimization before using a symbolic debugger. You can
remove all optimization with the /Od (disable optimizations) option.
You can disable all optimizations for a function by including the statement
#pragma optimize( "", off ). To restore optimization to its former state,
use the statement #pragma optimize( "", on ).
1.5 Customizing Your Optimizations
The default optimizations are sufficient for many applications, but you may
want to tune your programs according to criteria not known to the compiler.
The optimization options offer you a way of providing the compiler specific
goals for optimizing your code.
1.5.1 Choosing Speed or Size (/Ot and /Os)
In addition to the default optimizations, the Microsoft C compiler also
automatically uses the /Ot option, which optimizes for speed. The /Ot option
enables optimizations that increase speed but may also increase size. If you
would rather optimize for program size, use the /Os option. The /Os option
enables optimizations that decrease program size but may also decrease
program speed.
To optimize for speed or size on a per-function basis, use the optimize
pragma with the t option. The on setting instructs the compiler to optimize
for speed; the off setting instructs the compiler to optimize for
compactness of code. For example,
#pragma optimize( "t", off ) /* Optimize for smallest
code. */
.
.
.
#pragma optimize( "t", on ) /* Optimize for fastest
code. */
1.5.2 Generating Intrinsic Functions (/Oi)
In place of some normal function calls, the C compiler can insert "intrinsic
functions," which operate more quickly. Every time a function is called, a
set of instructions must be executed to store parameters and to create space
for local variables. When the function returns, more code must be executed
to release space used by local variables and parameters and to return values
to the calling routine. These instructions take time to execute. In the
context of an average-sized function, the additional code is minimal, but if
the function is only a line or two, the additional code can comprise almost
half of the function's compiled code.
One way to avoid this type of code expansion is to avoid such short
functions, especially in often-used sections of code where speed is
critical. But many library functions contain only a line or two of code. The
compiler provides two forms of certain library functions. One form is a
standard C function, which requires the overhead of a function call. The
other form is a set of instructions that
performs the same action as the function without issuing a function call.
This second form is called an intrinsic function. Intrinsic functions are
always faster than their function-call equivalents and can provide
significant optimizations at the object-code level.
For example, the function strcpy might be written as follows:
int strcpy(char * dest, char * source)
{
while( *dest++ = *source++ );
}
The compiler contains an intrinsic form of strcpy. If you instruct the
compiler to generate intrinsic functions, any call to strcpy will be
replaced with this intrin-sic form.
────────────────────────────────────────────────────────────────────────────
NOTE
While the example above is written in C for clarity, most of the library
functions use assembly language to take full advantage of the 80x86
instruction set. Intrinsic functions are not simply library functions
defined as macros.
────────────────────────────────────────────────────────────────────────────
Compiling with the /Oi option causes the compiler to use the intrinsic forms
of the following functions:
abs labs memset strcat
_disable lrotl outp strcmp
_enable lrotr outpw strcpy
fabs memcmp rotl strlen
inp memcpy rotr strset
inpw
While the following floating-point functions do not have true intrinsic
forms, they do have versions that pass arguments directly to the
floating-point chip instead of pushing them on the normal argument stack:
acos fmod acosl fmodl
asin log asinl logl
atan log10 atanl log10l
atan2 pow atan2l powl
ceil sin ceill sinl
cos sinh cosl sinhl
cosh sqrt coshl sqrtl
exp tan expl tanl
floor tanh floorl tanhl
────────────────────────────────────────────────────────────────────────────
WARNING
The compiler performs optimizations assuming math intrinsics have no side
effects. This assumption is true except if you have written your own matherr
function and that function alters global variables. If you have written a
matherr function to handle floating-point errors, and your function has side
effects, use the function pragma to instruct the compiler not to generate
intrinsic code for math functions.
────────────────────────────────────────────────────────────────────────────
If you want the compiler to generate intrinsic functions for only a subset
of the functions listed above, use the intrinsic pragma rather than the /Oi
option. The intrinsic pragma has the following format:
#pragma intrinsic( function1, ... )
If you want to have intrinsic functions generated for most of the functions
above and function calls for only a few, compile with the /Oi option and
force function use with the function pragma. The function pragma has the
following format:
#pragma function( function1, ... )
The following code illustrates the use of the intrinsic pragma:
#pragma intrinsic(abs)
void main( void )
{
int i, j;
i = big_routine_1();
j = abs( i );
big_routine_2( j );
}
Generating intrinsic functions for this program causes the call to abs to be
replaced with assembly-language code that takes the absolute value of a
number. The program will execute more quickly because the function-calling
overhead is no longer required when abs is called.
In the previous example, the overall speed increase is small because there
is only a single call to abs. In the following example, where the call to
abs is in a loop and there are many calls, you can save a significant amount
of execution time by generating intrinsic functions.
#pragma intrinsic( abs )
void main( void )
{
int i, j, x;
for( j = 0; j < 1000; j++ )
{
for( i = 0; i < 1000; i++)
{
x += abs( i - j );
}
}
printf( "The value of x is %d\n", x );
}
The following is a list of restrictions on using the intrinsic forms of
function calls:
■ Do not use the intrinsic forms of the floating-point math functions
with the alternate math libraries (mLIBCAy.LIB).
■ Do not use the intrinsic forms of the floating-point math functions in
OS/2 dynamic-link libraries (DLLs) because you must use the alternate
math library with LLIBCDLL.LIB.
■ If you use the /Ox (maximum optimization) option, you are enabling the
/Oi (generate intrinsic functions) option. Be careful that your use of
/Ox does not conflict with the points listed previously.
────────────────────────────────────────────────────────────────────────────
NOTE
Intrinsic versions of _enable, _disable, inp, outp, inpw, and outpw
do not work under OS/2. You must use the library versions. You can use the
function pragma to force these functions to become library calls.
────────────────────────────────────────────────────────────────────────────
1.5.3 Assuming No Aliasing (/Oa and /Ow)
An "alias" is a name used to refer to a memory location already referred to
by a different name. Because a memory access takes more time than it takes
to access the CPU's registers, the compiler tries to store frequently used
variables in registers. However, the aliasing reduces the extent to which a
compiler can keep variables in registers.
A pointer is a reference to a memory location. Because the value of a
pointer is not determined until the program is run, the compiler has no way
of knowing which memory location will be modified when the program executes;
it could be a reference to a variable. Therefore, the compiler must assume
that any time the value pointed to by any pointer changes, the value of any
variable might also change. This limits the extent to which the compiler can
move values from memory to registers.
The /Oa option tells the compiler to ignore the possibility of multiple
aliases for a memory location. In the list that follows, the term
"reference" means read or write; that is, whether a variable is on the
left-hand side of an assignment statement or the right-hand side, you are
still referring to it. In addition, any function calls that use a variable
as a parameter are references to that variable. When you tell the compiler
to assume that you are not doing aliasing, it expects that the following
rules are being followed for any variable not declared as volatile:
■ If a variable is used directly, no pointers are used to reference that
variable.
■ If a pointer is used to refer to a variable, that variable is not
referred to directly.
■ If a pointer is used to modify a memory location, no other pointers
are used to access the same memory location.
To clarify how these rules affect your code, consider the following example:
char p;
char *ptr_p;
ptr_p = &p; /* Take the address of p. */
You can now refer either to *ptr_p or to p, but not to both within the
same function. If you must refer to the variable by both names, you are
using aliases.
Code referring to the same location with two pointers uses aliases. For
example,
char *p_buf;
char *p_alias;
if( (p_alias = p_buf = malloc( 5000 )) == NULL )
return;
else
{
.
.
.
}
The code in the example above is common. It demonstrates dynamically
allocating a block of memory from the heap, and preserving the original
address in p_buf. The program then performs all pointer arithmetic on the
alias p_alias. When the function finishes with the block of memory, p_buf
is a valid argument for the free function because it still contains the
original address.
The /Oa and /Ow options tell the compiler that you have not used aliases in
your code.
The difference between the /Oa and the /Ow option is that when you use /Oa
you specify that you will not be doing aliasing (which allows the compiler
to perform significant optimizations that might not otherwise have been
possible), and that function calls are safe. The /Ow option is similar to
the /Oa option, except that after a function call, pointer variables must be
reloaded from memory.
Here is an example of a program that would be a poor candidate for the /Oa
or /Ow optimization option:
int g;
void main( void )
{
add_em( &g );
}
int add_em( int *p )
{
*p = 2; /* Assign a value to an alias for g. */
g = 3; /* Assign a value directly to g. */
return( *p + g );
}
In the function add_em, both g and *p refer to the same memory
location. This location is first assigned 2, then 3. The value pointed to
by *p (the alias for g) is then added to g, and the result is returned
to the main program. If you do not use the /Oa command-line option, the
compiler assumes that the reference to *p could refer to the same memory
location as does g and makes no attempt to use a register to store the
value of either. If, however, you do specify the /Oa option, the compiler
assumes that g and *p refer to different memory locations and stores
each in a different register. At the return statement, g will have a
different value than *p, even though both aliases should actually contain
the same value.
Note that the compiler keeps values in registers for only a limited time. If
different aliases to a memory location occur in different functions, for
example, they will not cause unexpected results. When in doubt, avoid
aliasing.
Bugs involving aliasing are difficult to spot.
Aliasing bugs most frequently show up as corruption of data. If you find
that global or local variables are being assigned seemingly random values,
take the following steps to determine if you have a problem with
optimization and aliasing:
■ Compile the program with /Od (disable optimizations).
■ If the program works when compiled with the /Od option, check your
normal compile options for the /Oa option (assume no aliasing).
■ If you were using the /Oa option, fix your compile options so that /Oa
is not specified.
────────────────────────────────────────────────────────────────────────────
NOTE
You can instruct the compiler to disable optimizations that are unsafe with
code that does aliasing by using the optimize pragma with the a or w option.
────────────────────────────────────────────────────────────────────────────
1.5.4 Performing Loop Optimizations (/Ol)
The /Ol option enables a set of optimizations involving loops. Because loops
involve sections of code that are executed repeatedly, they are targets for
optimization. These optimizations all involve moving code or rewriting code
so that it executes faster.
Loop optimization can be turned on with the /Ol option or with the loop_opt
pragma. The following line enables loop optimization for all subsequent
functions:
#pragma loop_opt( on )
The following line turns it off:
#pragma loop_opt( off )
The /Ol option removes invariant code.
An optimal loop contains only expressions whose values change through each
execution of the loop. Any subexpression whose value is constant should be
evaluated before the body of the loop is executed. Unfortunately, these
subexpressions are not always readily apparent. The optimizer can remove
many of these expressions from the body of a loop at compile time. This
example illustrates invariant code in a loop:
i = -100;
while( i < 0 )
{
i += x + y;
}
In the preceding example, the expression x + y does not change in the loop
body. Loop optimization removes this subexpression from the body of the loop
so that it is only executed once, not every time the loop body is executed.
The optimizer will change the code to the following fragment:
i = -100;
t = x + y;
while( i < 0 )
{
i += t;
}
Loop optimization is much more effective when the compiler can assume no
aliasing. While you can use loop optimization without the /Oa or /Ow option,
use /Oa to ensure that the most options possible are used.
Here is a code fragment that could have an aliasing problem:
i = -100;
while( i < 0 )
{
i += x + y;
*p = i;
}
If you do not specify the /Oa option, the compiler must assume that either
x or y could be modified by the assignment to *p. Therefore, the
compiler cannot assume the subexpression x + y is constant for each loop
iteration. If you specify that you are not doing any aliasing (with the /Oa
option), the compiler assumes that modifying *p cannot affect either x
or y, and that the subexpression is indeed constant and can be removed from
the loop, as in the previous example.
────────────────────────────────────────────────────────────────────────────
NOTE
All loop optimizations specified by the /Ol option or the loop_opt pragma
are safe optimizations. To enable aggressive loop optimizations, you must
use the enable aggressive optimizations (/Oz) option. While the
optimizations enabled by the combination of /Ol and /Oz are not safe for all
cases, they will work properly for most programs.
────────────────────────────────────────────────────────────────────────────
1.5.5 Disabling Unsafe Loop Optimizations (/On)
The disable unsafe loop optimizations (/On) option is an obsolescent option
and is only retained for compatibility with existing makefiles. Loop
optimizations are, by default, safe optimizations. The /On option is the
default and has the opposite effect of the /Oz (enable aggressive
optimizations) option.
1.5.6 Enabling Aggressive Optimizations (/Oz)
The compiler can perform extremely aggressive optimizations. These
optimizations produce high code quality both in terms of speed and size.
Certain programs, however, cannot be optimized with the technologies enabled
by the /Oz option. For these programs, you should not specify this option;
you can still use all other optimization options.
Because the optimization strategies enabled by the /Oz option are so
aggressive, they are not part of the maximum optimization (/Ox) option.
Examples of the effects of the /Oz option are
■ Loop optimization (/Ol). Loop optimization enables a technology that
anticipates program flow and tries to remove invariant expressions
from loops. When you specify the enable aggressive optimizations
option (/Oz), the compiler removes invariant expressions even when it
might cause an error. Errors with the enable aggressive optimizations
option occur most often when an invariant expression that can cause an
exception is protected by an if statement. The invariant expression is
hoisted out of the loop body, causing it to be evaluated prior to the
evaluation of the if statement that was designed to protect it. Here
are two examples that illustrate this problem:
for( i = 0; i 100; ++i )
if( float_val != 0.0F )
/* Protect against divide-by-zero. */
float_result = pi / float_val;
while( condition )
if( ptr_val != NULL )
/* Protect pointer dereference. */
char_var = *ptr_val;
■ Global register allocation (/Oe). The enable aggressive optimizations
option enables some register allocation strategies that can cause
invalid segment selectors to be placed in registers. Although this
problem is benign in DOS, it causes protection faults in OS/2.
────────────────────────────────────────────────────────────────────────────
NOTE
You can instruct the compiler to enable aggressive optimizations on a
function-by-function basis by using the optimize pragma with the z option.
────────────────────────────────────────────────────────────────────────────
1.5.7 Removing Stack Probes (/Gs)
Every time a function is called, the stack provides space for all parameters
and local variables declared in that function. A short assembly function
that checks for a stack overflow condition is then called. Stack overflows
are usually caused either by infinite loops or by runaway recursive
routines. Such errors can also be caused by extremely large parameters or
local variables.
Stack probes can be important during program development. Stack-overflow
errors alert you to problems in your code. When the program has been tested,
however, stack checking often becomes unnecessary. The compiler allows you
to remove stack-checking code with either the /Gs option or the check_stack
pragma. Eliminating stack probes produces programs that are smaller and that
run more quickly.
1.5.8 Enabling Global Register Allocation (/Oe)
The global register allocation option (/Oe) instructs the compiler to
analyze your program and allocate CPU registers as efficiently as possible.
Without the global register allocation option, the compiler uses the CPU's
registers for several purposes:
■ Holding temporary copies of variables
■ Holding variables declared with the register keyword
■ Passing parameters to functions declared with the _fastcall keyword
(or functions in programs compiled with the /Gr command-line option)
When you enable global register allocation, the compiler ignores the
register keyword and allocates register storage to variables (and possibly
to common subexpressions). The compiler allocates register storage to
variables or subexpressions according to frequency of use. Because of the
limited number of physical registers, variables held in registers are
sometimes placed back in memory to free the register for another use. Here
is a C program example that demonstrates how the compiler might rewrite your
code to accomplish this:
/* Original program */
func()
{
int i, j;
char *pc;
for( i = 0; i < 1000; ++i )
{
j = i / 3;
*pc++ = (char)i;
}
for( j = 0, --pc; j < 1000;
++j, --pc )
*pc--;
}
/* Example of how the compiler might optimize the
* code to move i and j in and out of registers */
func()
{
int i, j;
char *pc;
{
register int i; /* i is in a register for this block. */
for( i = 0; i < 1000; ++i )
{
j = i / 3;
*pc++ = (char)i;
}
}
{
register int j; /* j is in a register for this block. */
for( j = 0, --pc; j < 1000;
++j, --pc )
*pc--;
}
}
In the preceding example, there are blocks (enclosed in curly braces) whose
only purpose is to delimit the span of code across which variables should
remain in registers.
────────────────────────────────────────────────────────────────────────────
NOTE
You can enable or disable global register allocation on a
function-by-function basis using the optimize pragma with the e option.
────────────────────────────────────────────────────────────────────────────
1.5.9 Enabling Common Subexpression Optimization (/Oc and /Og)
When you use option /Og (enable global common subexpression optimizations),
the compiler searches entire functions for common subexpressions. Option /Oc
(default common subexpression optimization) examines only short sections of
code for common subexpressions. You can disable default common subexpression
optimization with the /Od option. For more information about common
subexpression optimization, see Section 1.4, "Default Optimization."
────────────────────────────────────────────────────────────────────────────
NOTE
You can enable or disable block-scope common subexpression optimization on a
function-by-function basis using the optimize pragma with the c option. You
can enable or disable global common subexpression optimization on a
function-by-function basis using the optimize pragma with the g option.
────────────────────────────────────────────────────────────────────────────
1.5.10 Achieving Consistent Floating-Point Results (/Op)
Floating-point numbers stored in memory use either 32, 64, or 80 bits,
depending on whether they are of type float, type double, or type long
double. The 80x87 family of coprocessors uses 80-bit registers for all
operations. If a value of type float or type double is kept in these
registers through a number of operations, it will be more accurate than if
that value is moved to and from memory between operations.
Because of the difference in precision between memory and register
representation of a floating-point number, a value stored in memory is not
always equal to the same value in the 80x87 register.
The difference in precision primarily affects strict equality or strict
inequality tests (== and !=); however, relational tests of magnitude (>, >=,
, and ) can behave erroneously if the coprocessor is able to maintain
significant digits that memory variables cannot.
You can avoid the difference in precision by using the /Op option. This
option forces floating-point values to be written to memory between
floating-point operations. While storing these values to memory reduces the
precision of floating-point expressions, it also ensures that these
expressions will produce consistent results regardless of the rest of the
code.
You can change the handling of floating-point results on a
function-by-function basis using the optimize pragma with the p option.
────────────────────────────────────────────────────────────────────────────
NOTE
Using the /Op option suppresses other optimizations because the
floating-point registers are not available for storage of intermediate
results. Because you suppress these optimizations, code compiled with the
/Op option executes more slowly than code compiled without this option.
Careful coding practices, especially in tests of strict equality and
inequality, can alleviate the need for this option.
────────────────────────────────────────────────────────────────────────────
1.5.11 Using the 80186, 80188, or 80286 Processor (/G0, /G1, /G2)
The compiler generates 8086 object code (/G0) unless you take special steps.
Because the newer processors (the 80186, 80188, and 80286) are
backwardcompatible with the 8086 instruction set, using this instruction set
ensures compatibility with all 80x86-based computers. While you gain
compatibility across the entire family of 80x86 processors, you lose the
advantage of some of the more powerful instructions in the newer processors.
If you know your program will only be running on an 80186, 80188, or 80286
processor, you can cause the compiler to generate instructions specific to
these processors. These instructions increase the speed of your program, but
you lose compatibility with machines that use older processors in the 80x86
family. Table 1.1 lists the options for processor-specific code generation:
Table 1.1 Processor Compatibility
╓┌──────────────────────┌────────────────────────────────────────────────────╖
Command-Line Option Compatible Processors
────────────────────────────────────────────────────────────────────────────
/G0 8088, 8086, 80188, 80186, 80286, 80388, 80486
/G1 80188, 80186, 80286, 80386, 80486
Command-Line Option Compatible Processors
────────────────────────────────────────────────────────────────────────────
/G1 80188, 80186, 80286, 80386, 80486
/G2 80286, 80386, 80486
────────────────────────────────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────
NOTE
When developing only for OS/2, always use the /G2 option, because OS/2 does
not run on the 8086, 8088, 80186, or 80188. Do not use /G2 for Family
Applications because they might be run on machines with 8088, 8086, 80188,
or 80186 processors.
────────────────────────────────────────────────────────────────────────────
1.5.12 Optimizing for Maximum Efficiency (/Ox)
The /Ox option combines a number of different optimizations:
■ Enable global register allocation (/Oe)
■ Enable global common subexpression optimization (/Og)
■ Enable block-scoped common subexpression optimization (/Oc)
■ Generate intrinsic functions (/Oi)
■ Perform loop optimizations (/Ol)
■ Optimize for speed (/Ot)
■ Remove stack probes (/Gs)
Use /Ozax /Gr to get the fastest program.
The /Ox option does not include several optimizations that can improve code
efficiency: /Oa (assume no aliasing), /Oz (enable aggressive optimizations),
and /Gr (use fastcall calling convention). Before enabling these
optimizations, you should read the sections that describe the /Oa and /Oz
options and the fastcall calling convention to determine if they are
appropriate for your application.
Use the optimize pragma to reduce code size.
If you are more concerned with executable file size than execution time, use
the /Ox and /Gs options, then issue the optimize pragma as follows:
#pragma optimize( "t", off )
This set of options produces the smallest possible code, while also
performing some speed optimizations.
1.6 Linker (LINK) Options that Control Optimization
Most code optimization is performed before the object file is produced.
There are four optimizations that the linker can perform to speed program
execution and reduce the disk space used by an executable file.
1.6.1 Enabling Far Call Optimization (/FARCALLTRANSLATION)
You can call a function two ways. In a far call, the function is called
using both the segment and the offset of the function. This allows a program
to call a routine outside a 64K segment. In a near call, both the calling
statement and the function must be located in the same segment. Only the
offset is used to access the function; the segment address is implicit. You
can only use near calls to routines located in the same segment.
Because of the architecture of the processor, near function calls execute
faster than far calls. The decision to declare functions as near or far is
often made when selecting a memory model. As it is difficult to determine
where the linker will place a given function in memory, it is impractical
for the programmer to choose the way a function is called.
Use /FARCALLTRANSLATION with medium, large, and huge model programs.
The /FARCALLTRANSLATION option enables far call optimization. When you use
this option, any function calls within the same segment as the function
being called are converted to near calls. This optimization has no effect if
you have selected the tiny, small, or compact model, because all calls are
already near calls.
The abbreviation for the /FARCALLTRANSLATION option is /F.
How /FARCALLTRANSLATION Affects Your Code
The linker can perform a form of post-optimization (an optimization that
occurs after most of the actual code generation is complete) that translates
far calls into near calls when possible. This optimization allows a given
function to be called with both near and far calls in the same program. To
perform this translation, the linker takes a section of object code such as
CALL FAR _func
where func is defined in the current segment, and replaces it with the
following code:
PUSH CS
CALL NEAR _func
NOP
This substitution works because the linker has inserted PUSH CS to place a
far return address on the stack.
Use /FARCALLTRANSLATION with /PACKCODE.
The /FARCALLTRANSLATION option is most effective when used in conjunction
with the /PACKCODE option discussed in Section 1.6.2. Using the /PACKCODE
option causes far calls that were intersegment to become intrasegment calls.
The /FARCALLTRANSLATION feature can then take advantage of the new grouping
to translate all intrasegment far calls into near calls.
Benefits of /FARCALLTRANSLATION
The /FARCALLTRANSLATION option is of significant benefit to protected-mode
programs. Table 1.2 illustrates why.
Table 1.2 Processor Clock Cycles for Calling Sequence
╓┌───────────────────┌──────────────┌──────────┌──────────────┌──────────────╖
Cycles (Real Cycles
Mode) (Protected
Mode)
────────────────────────────────────────────────────────────────────────────
Instructions 286 386 286 386
────────────────────────────────────────────────────────────────────────────
Far Function Call
CALL FAR PTR _func 13 17 26 34
Total 13 17 26 34
────────────────────────────────────────────────────────────────────────────
Near Function Call
Cycles (Real Cycles
Mode) (Protected
Mode)
PUSH CS 3 2 3 2
CALL NEAR PTR 7 7 7 7
_func
NOP 3 3 3 3
Total 13 12 13 12
────────────────────────────────────────────────────────────────────────────
Savings 0 5 13 22
────────────────────────────────────────────────────────────────────────────
1.6.2 Packing Code (/PACKCODE)
The /PACKCODE linker option groups neighboring code segments together. When
used with the /F option, the /PACKCODE option greatly increases the number
of near calls that can be made to a function. This option can be followed
with a limit (expressed in bytes) at which to stop packing and to begin a
new group. Here is the syntax for the /PACKCODE option: ;/PACKCODE option
/PACKCODE:number
where number is an optional hexadecimal, octal, or decimal number that
specifies the limit for packing. The radix (octal, decimal, or hexadecimal)
is specified just as you would specify it to a C program.
Radix Rules for Specification
────────────────────────────────────────────────────────────────────────────
Octal Specify the octal number with a leading
0. You can only use the digits 0 through
7 in an octal number. For example, 07777.
Decimal Specify the decimal number without a
leading 0. For example, 65530.
Hexadecimal Specify the hexadecimal number with a
leading 0x. For example, 0x3FFF.
If you omit the packing limit, the linker supplies a default value of 65,
530.
The abbreviation for the /PACKCODE option is /PACKC.
1.6.3 Packing Data (/PACKDATA)
The /PACKDATA option is analogous to the /PACKCODE option, except that it
groups together neighboring data segments instead of code segments. This
option is most useful when you have a large-model program that exceeds the
OS/2 limitation of 255 segments. By using /PACKDATA, you can group segments,
thereby reducing the total number OS/2 has to manage. Here is the syntax for
the /PACKDATA option:
/PACKDATA:number
where number is an optional hexadecimal, octal, or decimal number that
specifies the limit for packing. The radix (hexadecimal, octal, or decimal)
is specified just as you would specify it to a C program. For more
information on specifying hexadecimal, octal, or decimal numbers, see
Section 1.6.2 above.
If the packing limit is omitted, the linker supplies a default value of
65,535 (0xFFFF).
The abbreviation for the /PACKDATA option is /PACKD.
1.6.4 Packing the Executable File (/EXEPACK)
The executable file created by the compiler often contains sequences of
repeated bytes. You can remove these repeated sequences with the /EXEPACK
option. This decreases the size of the resulting executable file as well as
program load time.
────────────────────────────────────────────────────────────────────────────
WARNING
Because the /EXEPACK option removes debug information from the executable
file, you should not use it with the /CODEVIEW option.
────────────────────────────────────────────────────────────────────────────
1.7 Optimizing in Different Environments
The environment in which you plan to use a program can have a bearing on the
types of optimizations that you should use.
1.7.1 Optimizing in DOS
You need not take special precautions for programs written under DOS unless
you are writing a terminate-and-stay-resident (TSR) program. If an
interrupt-driven routine could modify a memory location in a program, you
should declare that variable volatile.
1.7.2 Optimizing in OS/2
Many of the rules for interrupt routines apply to OS/2. If one thread can
modify variables in another thread, declare these variables as volatile.
1.7.3 Optimizing in Microsoft Windows(tm)
Microsoft Windows(tm) can move segments dynamically. As a result of dynamic
heap compaction, pointers maintained in registers can be invalidated. The
/Ow option instructs the compiler that you will not be using aliases, but
that Windows might cause certain optimizations to be unsafe across function
calls.
If you are not using any aliases you must still use the /Ow option with
Windows programs. See Section 1.5.3, "Assuming No Aliasing (/Oa and /Ow),"
for more information.
1.8 Choosing Function-Calling Conventions
In Microsoft C, version 6.0, functions can call other functions using three
different conventions. Note that, while no calling convention has been
defined as "standard," most C compilers use conventions similar to those
described here. The C calling convention requires the most object code to
set up, but it is the only calling convention that supports functions with
variable-length argument lists. The FORTRAN/Pascal calling convention is
more compact, but does not allow for variable-length argument lists. The
_fastcall, or register calling convention is the fastest of the three
calling conventions, but it does not support variable-length argument lists
or mixed-language program interfaces.
1.8.1 The C Calling Convention (/Gd)
Because C allows functions to have a variable number of parameters,
parameters must be pushed onto the stack from right to left. (If parameters
were pushed from left to right, it would be difficult for the compiler to
determine which parameter was first.) If you do not specify command-line
options that modify the function-calling convention, the C calling
convention is used; otherwise, the _cdecl keyword must be used before any
function using the C calling convention.
If, for example, you use the /Gr (register calling convention) option when
you compile, and the function add_two must have the C calling convention,
declare add_two as follows:
int _cdecl add_two( int x, int y );
1.8.2 The FORTRAN/Pascal Calling Convention (/Gc)
Use the FORTRAN/Pascal calling convention for any functions declared with
either the _fortran or _pascal keywords. (The two keywords currently produce
identical results.) Parameters to these functions are always pushed on the
stack from left to right. While any function can be declared with the
FORTRAN/ Pascal convention, it is used primarily for prototypes to Pascal or
FORTRAN routines called from within C programs. This calling convention can
also produce smaller, faster programs.
The /Gc option (generate Pascal-style function calls) can be used to make
all functions in a file observe the FORTRAN/Pascal calling convention.
Note that C run-time library routines must still be called using C calling
conventions. Because these routines are declared using the _cdecl keyword
header files, you must include the appropriate header files in any program
using run-time library routines.
Functions with variable-length parameter lists (such as printf) cannot use
the FORTRAN/Pascal calling convention.
────────────────────────────────────────────────────────────────────────────
NOTE
The /ML, /MD, and /MT options cause all floating-point functions to be
declared as FORTRAN/Pascal. See Chapter 16, "Dynamic Linking with OS/2," for
more information.
────────────────────────────────────────────────────────────────────────────
1.8.3 The Register Calling Convention (/Gr)
You can decrease execution time if parameters to functions are passed in
registers rather than on the stack. Compiling with the /Gr command-line
option enables the register calling convention for an entire file. The
_fastcall keyword enables the register calling convention on a
function-by-function basis.
Because the 80x86 processor has a limited number of registers, only the
first three parameters are allocated to registers; the rest are passed using
the FORTRAN/Pascal calling convention. The register calling convention can
increase the speed of a program.
────────────────────────────────────────────────────────────────────────────
NOTE
The compiler allocates different registers for variables declared as
register and for passing arguments using the register calling convention.
This calling convention will not conflict with any register variables that
you may have declared.
────────────────────────────────────────────────────────────────────────────
Exercise caution when using the register calling convention for any function
written in in-line assembly language. Your use of registers in
assembly-language could conflict with the compiler's use of registers for
storing parameters.
1.8.4 The _fastcall Calling Convention
This section describes the details of the _fastcall calling convention. The
information is for the use of assembly-language programmers who are
interested in using either the in-line assembler or the Microsoft Macro
Assembler (MASM) to write functions declared as _fastcall. Functions
declared as _fastcall accept arguments in registers rather than on the
stack; functions declared as _cdecl or _pascal accept parameters only on the
stack.
────────────────────────────────────────────────────────────────────────────
WARNING
The register usage documented here applies only to Microsoft C, version 6.0.
It may change in future releases of the compiler.
────────────────────────────────────────────────────────────────────────────
Argument-Passing Convention
The _fastcall calling convention is a "strongly typed" register calling
convention. This typing allows the compiler to generate better code by
passing arguments in registers that correspond to the data type you are
passing. Because the compiler chooses registers depending on the type of the
argument and not in a strict linear order, the calling program and called
function must agree on the types of the arguments in order to communicate
data correctly.
For each type of argument there is a list of register candidates. The
arguments are allocated to registers or, if no suitable register remains
unused, are pushed onto the stack left-to-right. Each argument is put in the
first register candidate that does not already contain an argument. Table
1.3 shows the basic types and the register candidate list for each.
Table 1.3 Register Candidates
╓┌────────────────────────────────┌──────────────────────────────────────────╖
Type Register Candidates
Type Register Candidates
────────────────────────────────────────────────────────────────────────────
character AL, DL, BL
unsigned character AL, DL, BL
integer AX, DX, BX
unsigned integer AX, DX, BX
long integer DX:AX
unsigned long integer DX:AX
near pointer BX, AX, DX
far or huge pointer passed on the stack
────────────────────────────────────────────────────────────────────────────
All far and huge pointers are pushed on the stack, as are all structures,
unions, and floating-point types.
Return Value Convention
The _fastcall return value convention is based on the size of the return
value, except with floating-point types. All floating point types are
returned on the top of the NDP stack. For more information about the NDP
stack and returning floating-point values, see Chapter 4, "Controlling
Floating-Point Math Operations." The following list shows how values 4 bytes
or smaller, including unions and structures, are returned from a _fastcall
function.
Size Return Convention
────────────────────────────────────────────────────────────────────────────
1 Byte AL Register
2 Bytes AX Register
4 Bytes DX, AX Registers (for pointers, the
segment is returned in DX, the offset in
AX; for long integers,
the most-significant byte is returned in
DX, leastsignificant byte in AX)
Note that the protocol for returning values 4 bytes or smaller is the same
as for functions declared as _cdecl. To return structures and unions larger
than 4 bytes, the calling program passes a hidden parameter as the last item
pushed. This parameter is a near pointer, implicitly SS-relative, to a
buffer in which the value is to be returned. A far pointer to
SS:hidden-param must be returned in DX:AX. This is the same convention for
returning structures as _pascal.
Stack Adjustment Convention
Unlike functions declared as _cdecl, functions declared as _fastcall must
pop the arguments off the stack. The calling program does not adjust the
stack after function return.
Register Preservation Requirement
All functions must preserve the DS, BP, SI, and DI registers. Your _fastcall
function can modify the values in AX, BX, CX, DX, and ES.
Function-Naming Convention
The public name put into the object file for a function declared as
_fastcall is the name given by the user with a leading "at sign" (@). No
case translation is performed on the function name. The function declaration
int _fastcall FCFunc( void );
causes the compiler to place the public symbol @FCFunc in your object file
at every location FCFunc is referenced in your program.
If you do not declare the function as _fastcall in your C program, the
compiler assumes the default calling convention. The default is usually the
C calling convention but can be changed by the /Gc (Pascal Calling
Convention), /Gr (Register Calling Convention), or /Gd (C Calling
Convention) options. If the linker gives you an unresolved external
reference, you may have failed to declare an external _fastcall function
properly. For more information about calling conventions, see Chapter 12,
"Programming with Mixed Languages."
Chapter 2 Managing Memory
────────────────────────────────────────────────────────────────────────────
When you develop advanced applications in Microsoft C, you must pay
attention to memory management─that is, how data and code are stored and
accessed in memory. A well-thought-out memory strategy will make your
programs run faster and occupy less memory.
You can follow one or more of these memory management strategies:
■ Choose a standard memory model.
■ Create a mixed-model program with the _near, _far, _huge, and _based
keywords.
■ Create your own customized memory model.
■ Allocate memory as you need it with the malloc family of functions.
This chapter explains pointers, memory models (including the new tiny
model), variations such as custom memory models and mixed models, and based
pointers.
2.1 Pointer Sizes
One of the strengths of the C language is that it allows you to use pointers
to directly access memory locations.
Every Microsoft C program has at least two parts: the code (function
definitions) and the data (variables and constants). As a program runs, it
refers to elements of the code or the data by their addresses. These
addresses can be stored in pointer variables.
Pointer variables can fit into 16 bits or 32 bits, depending on the distance
of the object to which they refer.
2.1.1 Pointers and 64K Segments
IBM personal computers and compatibles use the Intel(R) 8086, 80186, 80286,
or 80386 processors (collectively called the 80x86 family). These processors
have a "segmented" architecture, which means they all have a mode that
treats memory as a series of segments, each of which occupies up to 64K of
memory. An offset from the base of the segment allows you to access
information within a given segment. Moving to a new segment requires
additional machine code.
A 16-bit pointer can address up to 65,536 locations.
The 64K limit is necessary because the 80x86 registers are 16 bits (2 bytes)
wide. A single register can address only 65,536 (64K) unique memory
locations.
A pointer variable that fully specifies a memory address needs 16 bits for
the segment location and another 16 bits for the offset within the segment,
a total of 32 bits. However, if you have several variables in the same
general area, your program can set the segment register once and treat the
pointers as smaller 16-bit quantities.
The 80x86 register CS holds the base for the code segment; the register DS
holds the base for the data segment. Two other segment registers are
available: the stack segment register (SS) and the extra segment register
(ES). (The 80386 has additional segment registers: FS and GS.)
2.1.2 Near Pointers
If you don't explicitly specify a memory model, Microsoft C defaults to the
small model, which allots up to 64K for the code and another 64K for the
data (see Figure 2.1).
(This figure may be found in the printed book.)
When a small-model program runs, the CS and DS segment registers never
change. All code pointers and all data pointers contain 16 bits because they
remain within the 64K range.
These 16-bit pointers to objects within a single 64K segment are called
"near pointers." Accessing a near object is called "near addressing."
2.1.3 Far Pointers
If your program needs more than 64K for code or data, at least some of the
pointers must specify the memory segment, which means these pointers occupy
32 bits instead of 16 bits.
These larger 32-bit pointers that can point anywhere in memory are called
"far pointers." Accessing a far object is called "far addressing."
Far pointers can address any location, but they are bigger and slower.
Far addressing has the advantage that your program can address any available
memory location─up to 640K in DOS or several megabytes in OS/2. The
disadvantages of the larger far pointers is that they take up more memory
(four bytes instead of two) and that any use of the pointers (assigning,
modifying, or otherwise accessing values) takes more time.
Allowing either code or data to expand beyond 64K makes your programs larger
and slower.
2.1.4 Huge Pointers
A third type of pointer in Microsoft C is the "huge" pointer, which applies
only to data pointers. Code pointers cannot be declared as huge.
A huge address is similar to a far address in that both contain 32 bits,
made up of a segment value and an offset value. They differ only in the way
pointer arithmetic is performed.
For far pointers, Microsoft C assumes that code and data objects lie
completely within the segment in which they start, so pointer arithmetic
operates only on the offset portion of the address. Limiting the size of any
single item to 64K makes pointer arithmetic faster.
Huge pointers overcome this size limitation; pointer arithmetic is performed
on all 32 bits of the data item's address, thus allowing data items
referenced by huge pointers to span more than one segment. In this code
fragment,
int _huge *hp;
int _far *fp;
.
.
.
hp++;
fp++;
both hp and fp are incremented. The huge pointer is incremented as a
32-bit value that represents the combined segment and offset. Only the
offset part of the far pointer (a 16-bit value) is incremented.
Extending the size of pointer arithmetic from 16 to 32 bits causes such
arithmetic to execute more slowly. You gain the use of larger arrays by
paying a price in execution speed.
2.1.5 Based Addressing
When you declare near, far, and huge variables, the Microsoft C compiler and
linker automatically manage details such as allocating memory and keeping
track of segments.
A "based pointer" is a fourth kind of pointer that operates as a 16-bit
offset from a base that you specify. In this respect, based addressing
differs from near, far, or huge addressing; you're responsible for naming
the base, instead of letting the compiler decide.
Based pointers are new to version 6.0 of Microsoft C. They are explained in
more detail in Section 2.5, "Using Based Variables."
2.2 Selecting a Standard Memory Model
If you want to choose one size for all pointers, there's no need to declare
each variable as near or far. Instead, you select a standard memory model
and your choice applies to all variables in the program.
One advantage of using standard memory models is simplicity. You specify the
way the compiler allocates storage for code and data only once.
A standard memory model assumes all pointers are the same size.
Another advantage is that the standard memory models do not require the use
of Microsoft-specific keywords such as _near and _far, so they are best for
writing code that is portable to other (non-DOS) systems.
The disadvantage of standard memory models is that, because they make global
assumptions about the environment, they do not always produce the most
efficient code.
2.2.1 The Six Standard Memory Models
The six Microsoft C memory models are shown in Table 2.1.
Table 2.1 Memory Models
╓┌─────────┌─────────────────────┌──────────┌────────────────────────────────╖
Maximum Total Memory
Model Code Data Data Arrays
────────────────────────────────────────────────────────────────────────────
Maximum Total Memory
Model Code Data Data Arrays
────────────────────────────────────────────────────────────────────────────
Tiny <64K <64K <64K
Small 64K 64K 64K
Medium No limit 64K 64K
Compact 64K No limit 64K
Large No limit No limit 64K
Huge No limit No limit No limit
────────────────────────────────────────────────────────────────────────────
The SETUP program creates the libraries that support the six standard memory
models.
When you choose one of the standard memory models, the compiler inserts the
name of the corresponding C run-time library in the object file so the
linker chooses it automatically. Each memory model has its own library,
except for the huge memory model (which uses the large-model library) and
the tiny model (which uses the small-model library).
2.2.2 Limitations on Code Size and Data Size
When writing a program in Microsoft C, keep in mind two limitations that
apply to all six memory models:
■ No single source module can generate 64K or more of code. You must
break large programs into modules and link their individual .OBJ files
to create the .EXE file.
■ No single data item can exceed 64K unless it appears in a huge-model
program or it has been declared with the _huge keyword.
2.2.3 The Tiny Memory Model
The tiny memory model is new to Microsoft C. It resembles the small model
with three exceptions:
■ The tiny model cannot exceed 64K per program (including both code and
data). A small-model program, on the other hand, can occupy up to
128K: 64K for code and 64K for data.
■ The tiny model produces .COM, rather than .EXE, files. To produce .COM
files, compile with the /AT option. Then link with the / TINY option
and link in CRTCOM.OBJ.
■ The tiny model applies to DOS only; it is not available in OS/2.
Although the tiny model imposes the most severe limits on code and data
size, it produces the smallest programs. The tiny memory model only offers a
load-time speed advantage over the small model; they both produce the
fastest programs.
2.2.4 The Huge Memory Model
The huge memory model is nearly identical to the large model. The only
difference is that the huge model permits individual arrays to exceed 64K in
size. For example, an int uses two bytes, so an array of 40,000 integers,
occupying 80,000 bytes of memory, would be permitted in the huge model. All
other models limit each array, structure, or other data object to no more
than 64K.
────────────────────────────────────────────────────────────────────────────
NOTE
Automatic arrays cannot be declared huge. Only static arrays and arrays
occupying memory allocated by the halloc function can be huge.
────────────────────────────────────────────────────────────────────────────
The huge model lifts the limits on arrays.
Although the huge model lifts the limits on arrays, some size restrictions
do apply. To maintain efficient addressing, no individual array element is
allowed to cross a segment boundary. This has the following implications:
■ No single element of an array can be larger than 64K. An array can be
larger than 64K, but its individual elements cannot.
■ For any array larger than 128K, all elements must have a size in bytes
equal to a power of 2: 2 bytes, 4 bytes, 8 bytes, 16 bytes, and so on.
If the array is 128K or smaller, its elements can be any size, up to
and including 64K.
Pointer arithmetic changes within the huge model, as well. In particular,
the sizeof operator may return an incorrect value. The ANSI draft standard
for C defines the value returned by sizeof to be of type size_t (which, in
Microsoft C, is
an unsigned int). The size in bytes of a huge array is an unsigned long
value, however. To find the correct value, you must use a type cast:
(unsigned long)sizeof(monster_array)
Similarly, the C language defines the result of subtracting two pointers as
ptrdiff_t (a signed int in Microsoft C). Subtracting two huge pointers will
yield a long value. Microsoft C gives the correct result with the following
type cast:
(long)(ptr1_huge - ptr2_huge)
When you select huge model, all extern arrays are treated as _huge.
Operations on data declared as _huge can be less efficient than the same
operations on data declared as _far.
2.2.5 Null Pointers
Within the medium and compact models, code pointers and data pointers differ
in size: one is 16 bits wide and the other is 32 bits wide. When using these
memory models, you should be careful in your use of the manifest constant
NULL.
NULL represents a null data pointer. The C include files define it as
#define NULL ((void *) 0)
There can be problems in models with different sizes of code and data
pointers.
In memory models where data pointers have the same size as code pointers,
the actual size of a null pointer doesn't matter. In memory models where
code and data pointers are different sizes, problems can occur. Consider
this example:
void main()
{
func1( NULL );
func2( NULL );
}
func1( char *dp )
{
.
.
.
}
func2( char (*fp)( void ) )
{
.
.
.
}
In the absence of function prototypes for func1 and func2, the compiler
always assumes that NULL refers to data and not code.
The example above works correctly in tiny, small, large, and huge models
because, in those models, a data pointer is the same size as a code pointer.
Under medium or compact model, however, main passes NULL to func2 as a
null data pointer rather than as a null code pointer (a pointer to a
function), which means the pointer is the wrong size.
To ensure that your code works properly in all models, declare each function
with a prototype. For example, before main, include these two lines:
int func1( char *dp );
int func2( char (*fp)( void ));
If you add these prototypes to the example, the code works properly in all
memory models. Prototypes force the compiler to coerce code pointers to the
correct size. Prototypes also enable strong type-checking of parameters.
2.2.6 Specifying a Memory Model
If you do not specify a memory model, Microsoft C defaults to the small
model, which is adequate for many small to mid-sized programs.
You can select a memory model from the Programmer's WorkBench or from the
command line.
Selecting from within PWB
If you're compiling from the Programmer's WorkBench, open the Options menu
and choose C Global Build Options. The available memory models appear in the
upper left corner. Choose one of the six standard models or choose
Customized and type in the options for a customized model.
Selecting from the Command Line
You can choose a memory model by including an option on the command line.
For example, to compile CLICK.C as a compact-model program, type this:
CL /AC CLICK.C
The /AC option selects the compact memory model. The six options and four
libraries are listed below:
Option Memory Model: Library
────────────────────────────────────────────────────────────────────────────
/AT Tiny Model: SLIBCxx.LIB (plus CRTCOM.OBJ)
/AS Small Model: SLIBCxx.LIB
/AM Medium Model: MLIBCxx.LIB
/AC Compact Model: CLIBCxx.LIB
/AL Large Model: LLIBCxx.LIB
/AH Huge Model: LLIBCxx.LIB
2.3 Mixing Memory Models
In standard memory models, explained above, all data pointers are the same
size and all code pointers are the same size.
A mixed memory model selectively combines different types of pointers within
the same program. A mixed model extends the limits of a given memory model
while retaining its benefits.
A mixed memory model lets you mix near and far pointers.
For example, imagine a programming situation where you add an array to a
small-model program, pushing the data segment past the 64K limit.
You could solve the problem by moving up from the small to the compact
memory model. Doing so would bump all data pointers from two to four bytes.
The .EXE file would grow accordingly. Execution time would slow.
A second and perhaps better solution is to stay within the standard small
memory model, which uses near pointers, but to declare the new array as far.
You mix near pointers and far pointers, creating a mixed model.
Microsoft C lets you override the standard addressing convention for a given
memory model by specifying that certain items are _near, _far, _huge, or
_based. These keywords are not a standard part of the C language; they are
Microsoft extensions, meaningful only on systems that use 80x86
microprocessors. Using these keywords may affect the portability of your
code.
────────────────────────────────────────────────────────────────────────────
NOTE
Previous versions of the Microsoft C Compiler accepted the keywords near,
far, and huge without an initial underscore. Since the ANSI draft standard
for C permits compiler implementors to reserve identifiers that begin with
underscores, an underscore was added to these keywords to mark them as
Microsoft-specific. To maintain compatibility with existing source code, the
compiler still recognizes the obsolescent versions of these keywords.
────────────────────────────────────────────────────────────────────────────
You can compile a program in the small model, for example, but declare a
certain array to be _far. At run time, the address of that array occupies
four bytes. The program may slow slightly when accessing items in that
particular far array, but throughout the rest of the program, all addressing
would be near. Note that all pointers to elements of an array declared as
_far must also be declared as _far.
Table 2.2 lists the effects of these keywords on data pointers, code
pointers, and pointer arithmetic.
Table 2.2 Addressing Declared with Microsoft Keywords
╓┌────────┌─────────────────────┌─────────────────────┌──────────────────────╖
Keyword Data Code Arithmetic
────────────────────────────────────────────────────────────────────────────
_near Data reside in Functions reside in 16 bits
default data current code
segment; 16-bit segment; 16-bit
addresses addresses
Keyword Data Code Arithmetic
────────────────────────────────────────────────────────────────────────────
addresses addresses
_far Data can be anywhere Functions can be 16 bits
in memory, not called from anywhere
necessarily in the in memory; 32-bit
default data addresses
segment; 32-bit
addresses
_huge Data can be anywhere Not applicable; 32 bits
in memory, not code cannot be (data only)
necessarily in the declared _huge
default data segment.
Individual data
items (arrays) can
exceed 64K in size;
32-bit addresses
_based Data can be anywhere Not applicable; 16 bits
Keyword Data Code Arithmetic
────────────────────────────────────────────────────────────────────────────
_based Data can be anywhere Not applicable; 16 bits
in memory, not code cannot be (data only)
necessarily in the declared _based
default data
segment; 16-bit
addresses plus a
known base provide
the range of 32-bit
addresses
────────────────────────────────────────────────────────────────────────────
2.3.1 Pointer Problems
When you declare items to be _near, _far, _huge, or _based, you can link
with a standard run-time library. Be aware, however, that in some cases, the
modified pointers will be incompatible with standard library functions.
Watch for these problems that affect pointers:
■ A library function that expects a 16-bit pointer as an argument will
not function properly with modified variables that occupy 32 bits. In
other words, you can cast a near pointer to a far pointer, because it
adds the segment value and maintains the integrity of the address. If
you cast a far pointer to near, however, the compiler generates a
warning message because the offset may not lie within the default data
segment, in which case the original far address is irretrievably
lost.
■ A library function that returns a pointer will return a pointer of the
default size for the memory model. This is only a problem if you are
assigning the return value to a pointer of a smaller size. For
example, there may be difficulties if you compile with a model that
selects far data pointers, but you have explicitly declared the
variable to receive the return value _near.
This warning does not apply to all functions. See Section B.2.8 in
Appendix B for a list of model-independent string and memory functions
such as _fstrcat, the far version of strcat.
■ Based pointers pose a special problem. Based pointers are passed to
other functions as is (without normalization). Certain functions
expect to receive based pointers, but most do not. Therefore, in most
cases, you must either explicitly cast a based pointer to a far
pointer or make sure that all functions that receive based pointers
are prototyped.
Some run-time library functions support near, far, huge, and based
variables. For example, halloc allocates memory for a huge data array.
You can always pass the value (but not the address) of a far item to a
small-model library routine. For example,
/* Compile in small model */
#include <stdio.h>
long _far time_val;
void main()
{
time( &time_val ); /* Illegal far address */
printf( "%ld\n", time_val ); /* Legal value */
}
When you use a mixed memory model, you should include function prototypes
with argument-type lists to ensure that all pointer arguments are passed to
functions correctly.
2.3.2 Declaring Near, Far, Huge, and Based Variables
The _near, _far, _huge, and _based keywords modify either objects or
pointers to objects. When using them to declare variables, keep these rules
in mind:
■ The keyword always modifies the object or pointer immediately to its
right. In complex declarations, think of the _far keyword and the item
to its right as being a single unit. For example, in the case of the
declaration
char _far * _near *p;
p is a near pointer to a far pointer to char, which resides in the
default data segment for the memory model being used.
By contrast, the declaration
char _far * _near p;
is a far pointer to char that will always be stored in DGROUP,
regardless of the memory model being used.
■ If the item immediately to the right of the keyword is an identifier,
the keyword determines whether the item will be allocated in the
default data segment ( _near) or a separate data segment ( _far,
_huge, or _based). For example,
char _far a;
allocates a as an item of type char with a _far address.
■ If the item immediately to the right of the keyword is a pointer, the
keyword determines whether the pointer will hold a near address (16
bits), a based address (16 bits), a far address (32 bits), or a huge
address (also 32 bits). For example,
char _huge *p;
allocates p as a huge pointer (32 bits) to an item of type char. Any
arithmetic performed on the huge pointer p will affect all 32 bits.
That is, the instruction p++ increments the pointer as a 32-bit
entity.
2.3.3 Declaring Near and Far Functions
You cannot declare functions as _huge or _based. The rules for using the
_near and _far keywords for functions are similar to those for using them
with data:
■ The keyword always modifies the function or pointer immediately to its
right.
■ If the item immediately to the right of the keyword is a function, the
keyword determines whether the function will be allocated as near or
far. For example,
char _far fun();
defines fun as a function with a 32-bit address that returns a char.
The function may be located in near memory or far memory, but it is
called with the full 32-bit address. The _far keyword applies to the
function, not to the return type.
■ If the item immediately to the right of the keyword is a pointer to a
function, the keyword determines whether the function will be called
using a near (16-bit) or far (32-bit) address. For example,
char (_far *pfun)( );
defines pfun as a far pointer (32 bits) to a function returning type
char.
■ Function declarations must match function definitions.
■ The _huge and _based keywords do not apply to functions. That is, a
function cannot be huge (larger than 64K) or based. A function can
return a huge data pointer to the calling function. A function can
return a based pointer unless it is a pointer based on _self (see
Section 2.5.2, "Declaring Based Variables").
The example below declares fun1 as a far function returning type char:
char _far fun1(void); /* small model */
char _far fun(void)
{
.
.
.
}
Here, the fun2 function is a near function that returns a far pointer to
type char:
char _far * _near fun2( ); /* large model */
char _far * _near fun( )
{
.
.
.
}
The example below declares pfun as a far pointer to a function that has an
int return type, assigns the address of printf to pfun, and prints "Hello
world." twice.
/* Compile in medium, large, or huge model */
#include <stdio.h>
int (_far *pfun)( char *, ... );
void main()
{
pfun = printf;
pfun( "Hello world.\n" );
(*pfun)( "Hello world.\n" );
}
2.3.4 Pointer Conversions
Passing near or far pointers as arguments to functions can cause automatic
conversions in the size of the pointer argument. Passing a pointer to an
unprototyped function forces the pointer size to the larger of the following
two sizes:
■ The default pointer size for that type, as defined by the memory model
selected during compilation.
For example, in medium-model programs, data pointer arguments are near
by default, and code pointer arguments are far by default.
■ The size of the type of the argument.
Note that if you supply a based pointer as an argument to a function and do
not specifically cast it to a far pointer type, a 16-bit offset from the
base segment is passed.
Function prototypes prevent problems that may occur in mixed memory models.
If you provide a function prototype with complete argument types, the
compiler performs type-checking and enforces the conversion of actual
arguments to the declared type of the corresponding formal argument.
However, if no declaration is present or the argument-type list is empty,
the compiler will convert nonbased pointer arguments automatically to the
default type or the type of the argument, whichever is larger. To avoid
mismatched arguments, always use a prototype with the argument types.
For example, the following program produces unexpected results in
compact-model, large-model, or huge-model programs.
void main( )
{
int _near *x;
char _far *y;
int z = 1;
test_fun( x, y, z ); /* x is coerced to far
pointer in compact,
large, or huge model */
}
int test_fun( int _near *ptr1, char _far *ptr2, int a)
{
printf("Value of a = %d\n", a);
}
If the preceding example is compiled as a tiny, small, or medium program,
the size of x is 16 bits, the size of y is 32 bits, and the value
printed for a is 1.
However, if the example is compiled in compact, large, or huge model, both
x and y are automatically converted to far pointers when they are passed
to test_fun. Since ptr1, the first parameter of test_fun, is defined as a
near pointer argument, it takes only 16 bits of the 32 bits passed to it.
The next parameter, ptr2, takes the remaining 16 bits passed to ptr1, plus
16 bits of the 32 bits passed to it. Finally, the third parameter, a, takes
the leftover 16 bits from ptr2, instead of the value of z in the main
function.
This shifting process does not generate an error message, because both the
function call and the function definition are legal. In this case the
program does not work as intended, however, since the value assigned to a
is not the value intended.
To pass ptr1 as a near pointer, you should include a function prototype
that specifically declares this argument for test_fun as a near pointer,
as shown below:
/* First, prototype test_fun so the compiler
* knows in advance about the near pointer argument:
*/
int test_fun (int _near*, char _far *, int);
main ( )
{
int _near *x;
char _far *y;
int z = 1;
test_fun ( x, y, z ); /* now, x is not coerced
* to a far pointer; it is
* passed as a near pointer,
* no matter which memory
* model is used
*/
}
int test_fun ( int _near *ptr1, char _far *ptr2, int a)
{
printf ( "Value of a = %d\n", a );
}
2.4 Customizing Memory Models
A third way to manage memory is to combine different features from standard
memory models to create your own customized memory model. You should have a
thorough understanding of C memory models and the architecture of 80x86
processors before creating your own nonstandard memory models.
In a customized model, you select the size of code pointers and data
pointers.
The /Astring option lets you change the attributes of the standard memory
models to create your own memory models. The three letters in string
correspond to the code pointer size, the data pointer size, and the stack
and data segment setup, respectively. Because the letter allowed in each
field is unique to that field, you can give the letters in any order after
/A. All three letters must be present.
The standard memory-model options (/AT, /AS, /AM, /AC, /AL, and /AH) can be
specified in the /Astring form. As an example of how to construct memory
models, the standard memory-model options are listed below with their
/Astring equivalents:
Standard Custom Equivalent
────────────────────────────────────────────────────────────────────────────
/AT /Asnd
/AS /Asnd
/AM /Alnd
/AC /Asfd
/AL /Alfd
/AH /Alhd
For example, you might want to create a huge-compact model. This model would
allow huge data items but only one code segment. The option for specifying
this model would be /Ashd.
────────────────────────────────────────────────────────────────────────────
NOTE
Tiny model is identical to small model except that it causes the linker to
search for CRTCOM.LIB. The executable file generated when you specify tiny
model is a .COM file rather than a .EXE.
────────────────────────────────────────────────────────────────────────────
2.4.1 Setting a Size for Code Pointers
Within a custom memory model, you choose whether code pointers are short or
long:
Option Size
────────────────────────────────────────────────────────────────────────────
/Asxx Short (near) code pointers
/Alxx Long (far) code pointers
The /As (short) option tells the compiler to generate near 16-bit pointers
and addresses for all functions. This is the default for tiny-, small-, and
compact-model programs.
The /Al (long) option means that far 32-bit pointers and addresses are used
to address all functions. Far pointers are the default for medium-, large-,
and huge-model programs.
2.4.2 Setting a Size for Data Pointers
Data pointers can be near, far, or huge:
Option Size
────────────────────────────────────────────────────────────────────────────
/Axnx Near data pointers
/Axfx Far data pointers
/Axhx Huge data pointers
The /An (near) option tells the compiler to use 16-bit pointers and
addresses for all data. This is the default for tiny-, small-, and
medium-model programs.
The /Af (far) option specifies that all data pointers and addresses are 32
bits. This is the default for compact- and large-model programs.
The /Ah (huge) option specifies that all data pointers and addresses are far
(32-bit) and that arrays are permitted to extend beyond a 64K segment. This
is the default for huge-model programs.
With far data pointers, no single data item can be larger than a segment
(64K) because address arithmetic is performed only on 16 bits (the offset
portion) of the address. When huge data pointers are used, individual data
items can be larger than a segment (64K) because address arithmetic is
performed on both the segment and the offset.
2.4.3 Setting Up Segments
Within a customized model, you can choose to make the stack segment (SS)
equal the data segment (DS), in which case they overlap:
Option Effect
────────────────────────────────────────────────────────────────────────────
/Axxd SS == DS
/A«xx»u SS != DS; DS reloaded on function entry
/A«xx»w SS != DS; DS not reloaded on function
entry
Segment Setup Option /Ad
The option /Ad tells the compiler that the segment addresses stored in the
SS and DS registers are equal. The stack segment and the default data
segment are combined into a single segment. This is the default for all
standard-model programs. In small- and medium-model programs, the stack plus
all data must occupy less than 64K; thus, any data item is accessed with
only a 16-bit offset from the segment address in the SS and DS registers.
In compact-, large-, and huge-model programs, initialized global and static
data are placed in the default data segment up to a certain threshold. The
address of this segment is stored in the DS and SS registers. All pointers
to data, including pointers to local data (the stack), are full 32-bit
addresses. This is important to remember when passing pointers as arguments
in multiple-segment programs. Although you may have more than 64K of total
data in these models, no more than 64K of data can occupy the default
segment. The /Gt and /ND options control allocation of items in the default
data segment if a program exceeds this limit.
Segment Setup Option /Au
The option /Au tells the compiler that the stack segment does not
necessarily coincide with the data segment. In addition, it adds the _loadds
attribute to all functions within a module, forcing the compiler to generate
code to load the DS register with the correct value prior to entering the
function body. Combine the /ND option with /Au to name data segments other
than the default. When /Au is combined with /ND, the address in the DS
register is saved upon entry to each function, and the new DS value for the
module in which the function was defined is loaded into the register. The
previous DS value is restored on exit from the function. Therefore, only one
data segment is accessible at any given time. The /ND option lets you
combine these segments into a single segment.
If a standard memory-model option precedes it on the command line, the /Au
option can be specified without any letters indicating data pointer or code
pointer sizes. The program uses a standard memory model, but different
segments are set up for the stack and data segments.
The /Au option is useful for OS/2 or Microsoft Windows dynamic-link
libraries (DLLs), since it forces DS to be loaded on entry to each function.
It is also useful for writing extensions to the Programmer's WorkBench. This
is a costly operation, however, so consider using the /Aw option.
Segment Setup Option /Aw
The option /Aw, like /Au, causes the compiler to assume that the stack
segment is separate from the data segment. The compiler does not
automatically load the DS register at each function entry point. The /Aw
option is useful in creating applications that interface with an operating
system or with a program running at the operating-system level. The
operating system or the program running under the operating system actually
receives the data intended for the application program and places that data
in a segment; then the operating system or program must load the DS register
with the segment address for the application program.
As with the /Au option, the /Aw option can be specified without data pointer
and code pointer letters if a standard memory-model option precedes it on
the command line. In such a case, the program uses the specified memory
model just as with /Au, but the DS register is not reloaded at each function
entry point.
Even though /Au and /Aw indicate that the stack may be in a separate
segment, the stack's size is still fixed at the default size unless this is
overridden with the /F compiler option or the /STACK linker option.
The /Aw option is useful for writing OS/2 and Microsoft Windows dynamic-link
libraries (DLLs), but care must be taken when it is used. Declare all entry
points to the dynamic-link library as _loadds to force DS to be loaded on
entry to the function (exactly like the /Au option). The other functions
will then be more efficient, though, because they will not have to perform
redundant loads of the DS register. For example,
_export _loadds _far pascal LibFunc( void )
{
.
.
.
HelperFunc(); }
HelperFunc( void )
{
.
.
.
}
The library entry point, LibFunc, is declared as _loadds to force the DS
register to be loaded on entry. The function HelperFunc, which is private
to the dynamic-link library, is declared as a normal C function. Since it
cannot be called from outside of the module, HelperFunc does not need to
reload DS.
If you choose one of the options that specifies that the stack segment is
not equal to the data segment (SS != DS), you cannot pass the address of
frame variables as arguments to functions that take near pointers. That is,
in tiny, small, and medium models, you cannot pass the address of a local
variable (which is allocated on the stack) as an argument, because the
receiving function will assume the pointer is relative to the data segment.
However, the receiving function could solve this problem by declaring the
pointer to be the following:
based(_segname("_STACK"))
Another solution would be to cast the pointer to a far pointer in both
locations as follows:
/* Call func with an explicit cast to far */
func( (char far *)frame_var );
.
.
.
void func( char far *formal_var )
2.4.4 Library Support for Customized Memory Models
Most C programs make function calls to the routines in the C run-time
library. When you write mixed-model programs, you are responsible for
determining which library (if any) is suitable for your program and for
ensuring that the appropriate library is linked. Table 2.3 shows the
libraries from which to extract the start-up routine for each customized
memory model.
Table 2.3 Start-Up Routines for Customized Memory Models
╓┌────────────────────────────────────────────────┌──────────────────────────╖
Memory-Model Option From Library
────────────────────────────────────────────────────────────────────────────
/Asnx; /AS plus /Ax SLIBCf.LIB
/Asfx; /Ashx; /AC plus /Ax CLIBCf.LIB
/Alnx; /AM plus /Ax MLIBCf.LIB
/Alfx; /Alhx; /AL plus /Ax; /AH plus /Ax LLIBCf.LIB
────────────────────────────────────────────────────────────────────────────
The /Ax option represents either /Au or /Aw. In the library names, f is
either E (emulator library), 7 (8087/80287 library), or A (alternate math
library).
2.4.5 Setting the Data Threshold
Option Effect
────────────────────────────────────────────────────────────────────────────
/Gt«number» Sets the threshold
The /Gt option causes all data items whose size is greater than to number
bytes to be allocated to a new data segment. When number is specified, it
must follow the /Gt option immediately, with no intervening spaces. When
number is omitted, the default threshold value is 256. When the /Gt option
is omitted, the default threshold value is 32,767.
The /Gt option applies only to compact-, large-, and huge-model programs,
since small- and medium-model programs have only one data segment. The
option is particularly useful with programs that have more than 64K of
initialized static and global data in small data items, because otherwise
you run out of memory in the default data segment and can't link the
program. The /Gt option has no effect on uninitialized global data.
2.4.6 Naming Modules and Segments
Option Effect
────────────────────────────────────────────────────────────────────────────
/NM modulename Names the module
/NT textsegment Names the code segment
/ND datasegment Names the data segment
"Module" is another name for an object file created by the C compiler from a
single source file. Every module has a name. The compiler uses this name in
error messages if problems are encountered during processing. The module
name is usually the same as the source-file name. You can change this name
using the /NM (name module) option. The new modulename can include any
combination of letters and digits. The space between /NM and modulename is
optional.
Every module has at least two segments: a code segment (sometimes called the
text segment) containing the program instructions, and a data segment
containing the program data.
The compiler normally creates the code and data segment names. The default
names depend on the memory model chosen for the program. For example, in
small-model programs the code segment is named _TEXT and the data segment is
named _DATA.
Table 2.4 summarizes the naming conventions for code and data segments.
Table 2.4 Segment-Naming Conventions
╓┌─────────┌─────────────┌───────┌───────────────────────────────────────────╖
Model Code Data Module
────────────────────────────────────────────────────────────────────────────
Tiny _TEXT _DATA ---
Small _TEXT _DATA ---
Medium module_TEXT _DATA filename
Model Code Data Module
────────────────────────────────────────────────────────────────────────────
Medium module_TEXT _DATA filename
Compact _TEXT _DATA filename
Large module_TEXT _DATA filename
Huge module_TEXT _DATA filename
────────────────────────────────────────────────────────────────────────────
In memory models that contain multiple data segments (compact, large, and
huge), _DATA is the name of the default data segment. Other data segments
have unique private names. You can override the default names with the
options /NT (name text) and /ND (name data).
The /ND option is commonly used to create and compile modules that contain
data only. Such modules can be accessed from other parts of the program by
declaring their variables as external.
If you change the name of the default data segment with /ND, your program
must load the DS register with the segment selector of your named data
segment before it accesses it. You must therefore compile your program
either with the /Astringform of the memory-model option and the /Au option
for the segment setup, or with the /A option for a s
Chapter 3 Using the In-Line Assembler
────────────────────────────────────────────────────────────────────────────
This chapter explains how to use the Microsoft C in-line assembler. Assembly
language serves many purposes, such as improving program speed, reducing
memory needs, and controlling hardware. The in-line assembler lets you embed
assembly-language instructions directly in your C source programs without
extra assembly and link steps. The in-line assembler is built into the
compiler─you don't need a separate assembler such as the Microsoft Macro
Assembler (MASM).
3.1 Advantages of In-Line Assembly
Because the in-line assembler doesn't require separate assembly and link
steps, it is more convenient than a separate assembler. In-line assembly
code can use any C variable or function name that is in scope, so it is easy
to integrate it with your program's C code. And because the assembly code
can be mixed in-line with C statements, it can do tasks that are cumbersome
or impossible in C alone.
The uses of in-line assembly include
■ Writing functions in assembly language
■ Spot-optimizing speed-critical sections of code
■ Calling DOS and BIOS routines with the INT instruction
■ Creating TSR (terminate-and-stay-resident) code or handler routines
that require knowledge of processor states
In-line assembly is a special-purpose tool. If you plan to transport an
application, you'll probably want to place machine-specific code in a
separate module. And because the in-line assembler doesn't support all of
MASM's macro and data directives, you may find it more convenient to use
MASM for such modules.
3.2 The _asm Keyword
The _asm keyword invokes the in-line assembler and can appear wherever a C
statement is legal. It cannot appear by itself. It must be followed by an
assembly instruction, a group of instructions enclosed in braces, or, at the
very least, an empty pair of braces. The term "_asm block" here refers to
any instruction or group of instructions, whether or not in braces.
Below is a simple _asm block enclosed in braces. (The code prints the "beep"
character, ASCII 7.)
_asm
{
mov ah, 2
mov dl, 7
int 21h
}
Alternatively, you can put _asm in front of each assembly instruction:
_asm mov ah, 2
_asm mov dl, 7
_asm int 21h
Since the _asm keyword is a statement separator, you can also put assembly
instructions on the same line:
_asm mov ah, 2 _asm mov dl, 7 _asm int 21h
Braces can prevent ambiguity and needless repetition.
All three examples generate the same code, but the first style─enclosing the
_asm block in braces─has some advantages. The braces clearly separate
assembly code from C code and avoid needless repetition of the _asm keyword.
Braces can also prevent ambiguities. If you want to put a C statement on the
same line as an _asm block, you must enclose the block in braces. Without
the braces, the compiler cannot tell where assembly code stops and C
statements begin. Finally, since the text in braces has the same format as
ordinary MASM text, you can easily cut and paste text from existing MASM
source files.
The braces enclosing an _asm block don't affect variable scope, as do braces
in C. You can also nest _asm blocks, but the nesting doesn't affect variable
scope.
3.3 Using Assembly Language in _asm Blocks
The in-line assembler has much in common with other assemblers. For example,
it accepts any expression that is legal in MASM, and it supports all 80286
and 80287 instructions. This section describes the use of assembly-language
features in _asm blocks.
Instruction Set
The in-line assembler supports the full instruction set of the Intel 80286
and 80287 processors. It does not recognize 80386- and 80387-specific
instructions. To use 80286 or 80287 instructions, compile with the /G2
option.
Expressions
In-line assembly code can use any MASM expression, that is, any combination
of operands and operators that evaluates to a single value or address.
Data Directives and Operators
Although an _asm block can reference C data types and objects, it cannot
define data objects with MASM directives or operators. Specifically, you
cannot use the definition directives DB, DW, DD, DQ, DT, and DF, or the
operators DUP or THIS. Nor are MASM structures and records available. The
in-line assembler doesn't accept the directives STRUC, RECORD, WIDTH, or
MASK.
EVEN and ALIGN Directives
While the in-line assembler doesn't support most MASM directives, it does
support EVEN and ALIGN. These directives put NOP (no operation) instructions
in the assembly code as needed to align labels to specific boundaries. This
makes instruction-fetch operations more efficient for some processors (not
including eight-bit processors such as the Intel 8088).
Macros
The in-line assembler is not a macro assembler. You cannot use MASM macro
directives (MACRO, REPT, IRC, IRP, and ENDM) or macro operators ( <>, !, &,
%, and .TYPE). An _asm block can use C preprocessor directives, however. See
Section 3.4, "Using C in _asm Blocks" for more information.
Segment References
You must refer to segments by register rather than by name (the segment name
_TEXT is invalid, for instance). Segment overrides must use the register
explicitly, as in ES:[BX].
Type and Variable Sizes
The LENGTH, SIZE, and TYPE operators have a limited meaning in in-line
assembly. They cannot be used at all with the DUP operator (because you
cannot define data with MASM directives or operators). But you can use them
to find the size of C variables or types:
■ The LENGTH operator can return the number of elements in an array. It
returns the value 1 for nonarray variables.
■ The SIZE operator can return the size of a C variable. A variable's
size is the product of its LENGTH and TYPE.
■ The TYPE operator can return the size of a C type or variable. If the
variable is an array, TYPE returns the size of a single element of the
array.
For instance, if your program has an eight-element int array,
int arr[8];
the following C and assembly expressions yield the size of arr and its
elements:
╓┌───────────┌──────────────────────────┌────────────────────────────────────╖
_asm C Size
────────────────────────────────────────────────────────────────────────────
LENGTH arr sizeof(ar)/sizeof(arr[0]) 8
SIZE arr sizeof (arr) 16
TYPE arr size14(arr[0]) 2
────────────────────────────────────────────────────────────────────────────
Comments
Instructions in an _asm block can use assembly-language comments:
_asm mov ax, offset buff ; Load address of buff
Because C macros expand into a single logical line, avoid using
assemblylanguage comments in macros (see Section 3.8, "Defining _asm Blocks
as C Macros"). An _asm block can also contain C-style comments, as noted
below.
The _emit Pseudoinstruction
The _emit pseudoinstruction is similar to the DB directive of MASM. It
allows you to define a single immediate byte at the current location in the
current text segment. However, _emit can define only one byte at a time, and
it can only define bytes in the text segment. It uses the same syntax as the
INT instruction.
One use for _emit is to define 80386-specific instructions, which the
in-line assembler does not support. The following fragment, for instance,
defines the 80386 CWDE instruction:
/* Assumes 16-bit mode */
#define cwde _asm _emit 0x66 _asm _emit 0x98
.
.
.
_asm {
cwde
}
Debugging and Listings
In-line assembly code can be debugged with CodeView.
Programs containing in-line assembly code can be debugged with the CodeView
debugger, assuming you compile with the /Zi option.
Within CodeView, you can set breakpoints on both C and assembly-language
lines. If you enable mixed assembly and C mode, you can display both the
source and disassembled form of the assembly code.
Note that putting multiple assembly instructions or C statements on one line
can hamper debugging with CodeView. In source mode, the CodeView debugger
lets you set breakpoints on a single line but not on individual statements
on the same line. The same principle applies to an _asm block defined as a C
macro, which expands to a single logical line.
If you create a mixed source and assembly listing with the /Fc compiler
option, the listing contains both the source and assembly forms of each
assemblylanguage line. Macros are not expanded in listings, but they are
expanded during compilation.
See Chapter 9, "Debugging C Programs with CodeView," for more information.
3.4 Using C in _asm Blocks
Because in-line assembly instructions can be mixed with C statements, they
can refer to C variables by name and use many other elements of C. An _asm
block can use the following C language elements:
■ Symbols, including labels and variable and function names
■ Constants, including symbolic constants and enum members
■ Macros and preprocessor directives
■ Comments (both /* */ and // )
■ Type names (wherever a MASM type would be legal)
■ typedef names, generally used with operators such as PTR and TYPE or
to specify structure or union members
Within an _asm block, you can specify integer constants with either C
notation or assembler radix notation (0x100 and 100h are equivalent, for
instance). This allows you to define (using #define) a constant in C, and
use it in both C and assembly portions of the program. You can also specify
constants in octal by preceding them with a 0. For example, 0777 specifies
an octal constant.
3.4.1 Using Operators
An _asm block cannot use C-specific operators, such as the operator.
However, operators shared by C and MASM, such as the * operator, are
interpreted as assembly-language operators. For instance, outside an _asm
block, square brackets ( [] ) are interpreted as enclosing array subscripts,
which C automatically scales to the size of an element in the array. Inside
an _asm block, they are seen as the MASM index operator, which yields an
unscaled byte offset from any data object or label (not just an array). The
following code illustrates the difference:
int array[10];
_asm mov array[6], bx ; Store BX at array+6 (not scaled)
array[6] = 0; /* Store 0 at array+12 (scaled) */
The first reference to array is not scaled, but the second is. Note that
you can use the TYPE operator to achieve scaling based on a constant. For
instance, the following statements are equivalent:
_asm mov array[6 * TYPE int], 0 ; Store 0 at array + 12
array[6] = 0; /* Store 0 at array + 12 */
3.4.2 Using C Symbols
An _asm block can refer to any C symbol in scope where the block appears. (C
symbols are variable names, function names, and labels─in other words, names
that aren't symbolic constants or enum members.)
A few restrictions apply to the use of C symbols:
■ Each assembly-language statement can contain only one C symbol.
Multiple symbols can appear in the same assembly instruction only with
LENGTH, TYPE, and SIZE expressions.
■ Functions referenced in an _asm block must be declared (prototyped)
earlier in the program. Otherwise, the compiler cannot distinguish
between function names and labels in the _asm block.
■ An _asm block cannot use any C symbols with the same spelling as MASM
reserved words (regardless of case). MASM reserved words include
instruction names such as PUSH and register names such as SI.
■ Structure and union tags are not recognized in _asm blocks.
3.4.3 Accessing C Data
A great convenience of in-line assembly is the ability to refer to C
variables by name. An _asm block can refer to any symbols─including variable
names─that are in scope where the block appears. For instance, if the C
variable var is in scope, the instruction
_asm mov ax, var
stores the value of var in AX.
If a structure or union member has a unique name, an _asm block can refer to
it using only the member name, without specifying the C variable or typedef
name before the period (.) operator. If the member name is not unique,
however, you must place a variable or typedef name immediately before the
period (.) operator. For instance, the following structure types share
same_name as their member name:
struct first_type
{
char *weasel;
int same_name;
};
struct second_type
{
int wonton;
long same_name;
};
If you declare variables with the types
struct first_type hal;
struct second_type oat;
all references to the member same_name must use the variable name, because
same_name is not unique. But the member weasel has a unique name, so you
can refer to it using only its member name:
_asm
{
mov bx, OFFSET hal
mov cx, [bx]hal.same_name ; Must use 'hal'
mov si, [bx].weasel ; Can omit 'hal'
}
Note that omitting the variable name is merely a coding convenience. The
same assembly instructions are generated whether or not it is present.
3.4.4 Writing Functions
If you write a function with in-line assembly code, it's a simple matter to
pass arguments to the function and return a value from it. The following
examples compare a function first written for a separate assembler and then
rewritten for the in-line assembler. The function, called power2, receives
two parameters, multiplying the first parameter by 2 to the power of the
second parameter. Written for a separate assembler, the function might look
like this:
; POWER.ASM
; Compute the power of an integer
;
PUBLIC _power2
_TEXT SEGMENT WORD PUBLIC 'CODE'
_power2 PROC
push bp ; Save BP
mov bp, sp ; Move SP into BP so we can refer
; to arguments on the stack
mov ax, [bp+4] ; Get first argument
mov cx, [bp+6] ; Get second argument
shl ax, cl ; AX = AX * ( 2 ^ CL )
pop bp ; Restore BP
ret ; Return with sum in AX
_power2 ENDP
_TEXT ENDS
END
Function arguments are usually passed on the stack.
Since it's written for a separate assembler, the function requires a
separate source file and assembly and link steps. C function arguments
usually are passed on the stack, so this version of the power2 function
accesses its arguments by their positions on the stack. (Note that the MODEL
directive, available in MASM and some other assemblers, also allows you to
access stack arguments and local stack variables by name.)
The POWER2.C program below writes the power2 function with in-line
assembly code:
/* POWER2.C */
#include <stdio.h>
int power2( int num, int power );
void main( void )
{
printf( "3 times 2 to the power of 5 is %d\n", \
power2( 3, 5) );
}
int power2( int num, int power )
{
_asm
{
mov ax, num ; Get first argument
mov cx, power ; Get second argument
shl ax, cl ; AX = AX * ( 2 to the power of CL )
}
/* Return with result in AX */
}
The in-line version of the power2 function refers to its arguments by name
and appears in the same source file as the rest of the program. This version
also requires fewer assembly instructions. Since C automatically preserves
BP, the _asm block doesn't need to do so. It can also dispense with the RET
instruction, since the C part of the function performs the return.
Because the in-line version of power2 doesn't execute a C return
statement, it causes a harmless warning if you compile at warning levels 2
or higher:
warning C4035: 'power2' : no return value
The function does return a value, but the compiler cannot tell that in the
absence of a return statement. Simply ignore the warning in this context.
3.5 Using and Preserving Registers
In general, you should not assume that a register will have a given value
when an _asm block begins. An _asm block inherits whatever register values
happen to result from the normal flow of control.
If you use the _fastcall calling convention, the compiler passes function
arguments in registers instead of the stack. This can create problems in
functions with _asm blocks, since a function has no way to tell which
parameter is in which register. If the function happens to receive a
parameter in AX and immediately stores something else in AX, the parameter
is lost. In addition, you must preserve the CX and ES registers in any
function declared with _fastcall.
Don't use the _fastcall calling convention for functions with _asm blocks.
To avoid such register conflicts, don't use the _fastcall convention for
functions that contain an _asm block. If you specify the _fastcall
convention globally with the /Gr compiler option, declare every function
containing an _asm block with _cdecl. (The _cdecl attribute tells the
compiler to use the normal C calling convention for that function.) If you
are not compiling with /Gr, avoid declaring the function with the _fastcall
attribute.
As you may have noticed in the POWER2.C example in Section 3.4.4, the
power2 function doesn't preserve the value in the AX register. When you
write a function in assembly language, you don't need to preserve the AX,
BX, CX, DX, ES, and flags registers. However, you should preserve any other
registers you use (DI, SI, DS, SS, SP, and BP).
────────────────────────────────────────────────────────────────────────────
WARNING
If your in-line assembly code changes the direction flag using the STD or
CLD instructions, you must restore the flag to its original value.
────────────────────────────────────────────────────────────────────────────
Functions return values in the AX and DX registers.
The POWER2.C example in Section 3.4.4 also shows that functions return
values in registers. This is true whether the function is written in
assembly language or in C.
If the return value is short (a char, int, or near pointer), it is stored in
AX. The POWER2.C example returned a value by terminating with the desired
value in AX.
If the return value is long, store the high word in DX and the low word in
AX. To return a longer value (such as a floating-point value), store the
value in memory and return a pointer to the value (in AX if near or in DX:AX
if far).
Assembly instructions that appear in-line with C statements are free to
alter the AX, BX, CX, and DX registers. C doesn't expect these registers to
be maintained between statements, so you don't need to preserve them. The
same is true of the SI and DI registers, with some exceptions (see Section
3.9, "Optimizing"). You should preserve the SP and BP registers unless you
have some reason to change them─to switch stacks, for instance.
3.6 Jumping to Labels
Like an ordinary C label, a label in an _asm block has scope throughout the
function in which it is defined (not only in the block). Both assembly
instructions and C goto statements can jump to labels inside or outside the
_asm block.
Labels in _asm blocks have function scope and are not case sensitive.
Unlike C labels, labels defined in _asm blocks are not case sensitive, even
when used in C statements. C labels are not case sensitive in an _asm block,
either. (Outside an _asm block, a C label is case sensitive as usual.) The
following do-nothing code shows all the permutations:
void func( void )
{
goto C_Dest; /* legal */
goto c_dest; /* error */
goto A_Dest; /* legal */
goto a_dest; /* legal */
_asm
{
jmp C_Dest ; legal
jmp c_dest ; legal
jmp A_Dest ; legal
jmp a_dest ; legal
a_dest: ; _asm label
}
C_Dest: /* C label */
return;
}
Don't use C library function names as labels in _asm blocks. For instance,
you might be tempted to use exit as a label,
jne exit
.
.
.
exit:
; More _asm code follows
forgetting that exit is the name of a C library function. The code doesn't
cause a compiler error, but it might cause a jump to the exit function
instead of the desired location.
As in MASM programs, the dollar symbol ($) serves as the current location
counter─a label for the instruction currently being assembled. In _asm
blocks, its main use is to make long conditional jumps:
jne $+5 ; next instruction is 5 bytes long
jmp farlabel
; $+5
.
.
.
farlabel:
3.7 Calling C Functions
An _asm block can call C functions, including C library routines. The
following example calls the printf library routine:
#include <stdio.h>
char format[] = "%s %s\n";
char hello[] = "Hello";
char world[] = "world";
void main( void )
{
_asm
{
mov ax, offset world
push ax
mov ax, offset hello
push ax
mov ax, offset format
push ax
call printf
}
}
Since function arguments are passed on the stack, you simply push the needed
arguments─string pointers, in the example above─before calling the function.
The arguments are pushed in reverse order, so they come off the stack in the
desired order. To emulate the C statement
printf( format, hello, world );
the example pushes pointers to world, hello, and format, in that order,
then calls printf.
3.8 Defining _asm Blocks as C Macros
C macros offer a convenient way to insert assembly code into C code, but
they demand extra care because a macro expands into a single logical line.
To create trouble-free macros, follow these rules:
■ Enclose the _asm block in braces.
■ Put the _asm keyword in front of each assembly instruction.
■ Use old-style C comments ( /* comment */ ) instead of assembly-style
comments ( ; comment ) or single-line C comments ( // comment ).
To illustrate, the following example defines a simple macro:
#define BEEP _asm \
/* Beep sound */ \
{ \
_asm mov ah, 2 \
_asm mov dl, 7 \
_asm int 21h \
}
At first glance, the last three _asm keywords seem superfluous. They are
needed, however, because the macro expands into a single line:
_asm /* Beep sound */ { _asm mov ah, 2 _asm mov dl, 7 _asm int 21h }
The third and fourth _asm keywords are needed as statement separators. The
only statement separators recognized in _asm blocks are the newline
character and _asm keyword. And since a block defined as a macro is one
logical line, you must separate each instruction with _asm.
The braces are essential as well. If you omit them, the compiler can be
confused by C statements on the same line to the right of the macro
invocation. Without the closing brace, the compiler cannot tell where
assembly code stops, and it sees C statements after the _asm block as
assembly instructions.
Use C comments in _asm blocks written as macros.
Assembly-style comments that start with a semicolon (;) continue to the end
of the line. This causes problems in macros because the compiler ignores
everything after the comment, all the way to the end of the logical line.
The same is true of single-line C comments ( // comment ). To prevent
errors, use old-style C comments ( /* comment */ ) in _asm blocks defined as
macros.
An _asm block written as a C macro can take arguments but cannot return a
value.
An _asm block written as a C macro can take arguments. Unlike an ordinary C
macro, however, an _asm macro cannot return a value. So you cannot use such
macros in C expressions.
Be careful not to invoke macros of this type indiscriminately. For instance,
invoking an assembly-language macro in a function declared with the
_fastcall con-vention may cause unexpected results. (See Section 3.5, "Using
and Preserving Registers.")
You can convert MASM macros to C macros.
Note that some MASM-style macros can be written as C macros. Below is a MASM
macro that sets the video page to the value specified in the page
argument:
setpage MACRO page
mov ah, 5
mov al, page
int 10h
ENDM
The following code defines setpage as a C macro:
#define setpage( page ) _asm \
{ \
_asm mov ah, 5 \
_asm mov al, page \
_asm int 10h \
}
Both macros do the same job.
3.9 Optimizing
The presence of an _asm block in a function affects optimization in a few
different ways. First, as you might expect, the compiler doesn't try to
optimize the _asm block itself. What you write in assembly language is
exactly what you get.
Second, the presence of an _asm block affects register variable storage.
Under normal circumstances (unless you suppress optimization with the /Od
option) the compiler automatically stores variables in registers. This is
not done, however, in any function that contains an _asm block. To get
register variable storage in such a function, you must request it with the
register keyword.
Since the compiler stores register variables in the SI and DI registers,
these registers represent variables in functions that request register
storage. The first eligible variable is stored in SI and the second in DI.
Preserve SI and DI in such functions unless you want to change the register
variables.
Keep in mind that the name of a variable declared with register translates
directly into a register reference (assuming a register is available for
such use). For instance, if you declare
register int sample;
and the variable sample happens to be stored in SI, then the _asm
instruction
_asm mov ax, sample
is equivalent to
_asm mov ax, si
If you declare a variable with register and the compiler cannot store the
variable in a register, the compiler issues a warning to that effect at
compile time. The solution is to remove the register declaration from that
variable.
Register variables form a slight exception to the general rule that an
assembly-language statement can contain no more than one C symbol. If one of
the symbols is a register variable, for example,
register int v1;
int v2;
then an instruction can use two C symbols, as in
mov v1, v2
Finally, the presence of in-line assembly code inhibits the following
optimizations for the entire function in which the code appears:
■ Loop ( /Ol )
■ Global register allocation ( /Oe )
■ Global optimizations and common subexpressions ( /Og )
These optimizations are suppressed no matter which compiler options you use.
Chapter 4 Controlling Floating-Point Math Operations
────────────────────────────────────────────────────────────────────────────
This chapter describes how to control the way your Microsoft C programs
perform floating-point math operations. It describes the math packages that
you can include in C libraries when you run the SETUP program, then
discusses the options you can specify in the Programmer's WorkBench (PWB)
or on the CL command line to choose the appropriate library for linking and
controlling floating-point instructions.
This chapter also explains how to override floating-point options by
changing libraries at link time, and how to control use of the Intel math
coprocessor (80x87) using the NO87 environment variable.
4.1 Declaring Floating-Point Types
Microsoft C supports three floating-point types that conform to the
Institute of Electrical and Electronics Engineers (IEEE) standard 754
format:
1. Type float, a 32-bit floating-point quantity
2. Type double, a 64-bit floating-point quantity
3. Type long double, an 80-bit floating-point quantity
You can declare variables as any of these types. You can also declare
functions that return any of these types.
4.1.1 Declaring Variables as Floating-Point Types
You can declare variables as float, double, or long double, depending on the
needs of your application. The principal differences between the three types
are the significance they can represent, the storage they require, and their
range. Table 4.1 shows the relationship between significance and storage
requirements.
Table 4.1 Floating-Point Types
╓┌─────────────┌───────────────────┌─────────────────────────────────────────╖
Type Significant Digits Number of Bytes
────────────────────────────────────────────────────────────────────────────
float 6-7 4
double 15-16 8
Type Significant Digits Number of Bytes
────────────────────────────────────────────────────────────────────────────
double 15-16 8
long double 19 10
────────────────────────────────────────────────────────────────────────────
Floating-point variables are represented by a mantissa, which contains the
value of the number, and an exponent, which contains the order of magnitude
of the number.
Table 4.2 shows the number of bits allocated to the mantissa and the
exponent for each floating-point type. The most-significant bit of any
float, double, or long double is always the sign bit. If it is 1, the number
is considered negative; otherwise, it is considered a positive number.
Table 4.2 Lengths of Exponents and Mantissas
╓┌─────────────┌────────────────┌────────────────────────────────────────────╖
Type Exponent Length Mantissa Length
────────────────────────────────────────────────────────────────────────────
Type Exponent Length Mantissa Length
────────────────────────────────────────────────────────────────────────────
float 8 bits 23 bits
double 11 bits 52 bits
long double 15 bits 64 bits
────────────────────────────────────────────────────────────────────────────
Because exponents are stored in an unsigned form, the exponent is biased by
half its possible value. For type float, the bias is 127; for type double,
it is 1,023; for type long double, it is 16,383. You can compute the actual
exponent value by subtracting the bias value from the exponent value.
The mantissa is stored as a binary fraction greater than or equal to 1 and
less than 2. For types float and double, there is an implied leading 1 in
the mantissa in the most-significant bit position, so the mantissas are
actually 24 and 53 bits long, respectively, even though the most-significant
bit is never stored in memory.
Instead of the storage method just described, the floating-point package can
store binary floating-point numbers as denormalized numbers. Denormalized
numbers are nonzero floating-point numbers with reserved exponent values in
which the most-significant bit of the mantissa is zero. By using
denormalized format, the range of a floating-point number can be extended at
the cost of precision. You cannot control whether a floating-point number is
represented in normalized or denormalized form; the floating-point package
determines the representation. The floating-point packages never use
denormalized form unless the exponent becomes less than the minimum that can
be represented in a normalized form.
Table 4.3 shows the minimum and maximum value you can store in variables of
each floating-point type. The values listed in this table apply only to
normalized floating-point numbers; denormalized floating-point numbers have
a smaller minimum value. Note that numbers retained in 80x87 registers are
always represented in 80-bit normal form; numbers can only be represented in
denormal form when stored in 32- or 64-bit floating-point variables (type
float and type long).
Table 4.3 Range of Floating-Point Types
╓┌─────────────┌──────────────────────────────┌──────────────────────────────╖
Type Minimum Value Maximum Value
Type Minimum Value Maximum Value
────────────────────────────────────────────────────────────────────────────
float 1.175494351 E - 38 3.402823466 E + 38
double 2.2250738585072014 E - 308 1.7976931348623158 E + 308
long double 3.362103143112093503 E - 4932 1.189731495357231765 E + 4932
────────────────────────────────────────────────────────────────────────────
If precision is less of a concern than storage, consider using type float
for floating-point variables. Conversely, if precision is the most important
criterion, use type long double.
Microsoft C observes type-widening rules.
Floating-point variables can be promoted to a type of greater significance
(for example, from type float to type double). Promotion often occurs when
you perform arithmetic on floating-point variables. This arithmetic is
always done in as high a degree of precision as the variable with the
highest degree of precision. For example, consider the following type
declarations:
float f_short;
double f_long;
long double f_longer;
f_short = f_short * f_long;
In the preceding example, the variable f_short is promoted to type double
and multiplied by f_long; then the result is rounded to type float before
being assigned to f_short.
In the example below (which uses the declarations from the preceding
example), the arithmetic is done in float (32-bit) precision on the
variables; the result is then promoted to type long double.
f_longer = f_short * f_short;
4.1.2 Declaring Functions that Return Floating-Point Types
You can declare functions that return the floating-point types float,
double, and long double. Functions that return types float or double do not
place their return values in registers; they place their return values in a
global location called the floating-point accumulator ( fac).
When declaring a function as a floating-point type in a multithreaded
program for OS/2, you should use the _pascal keyword to specify the
FORTRAN/Pascal calling convention. Declaring the function as _pascal causes
the return value to be placed on the stack, rather than in the
floating-point accumulator, fac.
You can write re-entrant functions that return floating-point types.
Using the current thread's private stack to return values allows you to
write re-entrant functions by eliminating possible contention between
threads for the floating-point accumulator.
────────────────────────────────────────────────────────────────────────────
NOTE
Functions that return type long double always place their return values on
the stack. You need not use the _pascal keyword with functions declared as
long double.
────────────────────────────────────────────────────────────────────────────
4.2 C Run-Time Library Support of Type long double
All of the Microsoft C run-time libraries support type long double. Each of
the normal floating-point math functions has a special version that supports
type long double. These functions have the same name as the functions that
support type float and type double, except that they end with l. For
example, the function that returns the absolute value of a variable of type
float or type double is fabs. The long double equivalent function is fabsl.
The two exceptions to this rule are the _atold and _strtodl functions.
4.3 Summary of Math Packages
The Microsoft C compiler offers a choice of the following three math
packages for handling floating-point operations:
1. Emulator (default)
2. Math coprocessor (a library that supports the Intel 80x87 family of
math coprocessors)
3. Alternate math
When you install Microsoft C, the SETUP program allows you to build combined
libraries. These libraries include the floating-point math library that you
choose. Any programs linked with that library use the math package included
in the library; you must use the appropriate PWB or CL option to make sure
that the library you want is used at link time.
The following descriptions of these math packages are designed to help you
choose the appropriate math option for your needs when you build a library
using SETUP. For more information about SETUP and about building combined
libraries, see Installing and Using the Microsoft C Professional Development
System.
Note that this chapter does not describe mode-specific libraries. For
simplicity, the base names of libraries are noted in their default form;
that is mLIBCf.LIB, where m is the model designator and f is the
floating-point math package designator. For information about mode-specific
libraries, see Chapter 14, "Building OS/2 Applications," or Installing and
Using the Microsoft C Professional Development System.
4.3.1 Emulator Package
Programs created using the emulator math package automatically detect and
use an 80x87 numeric coprocessor if one is installed. If no coprocessor is
installed, these 80x87 instructions are carried out in software. The
emulator package is the default math package; SETUP uses it if you do not
explicitly choose another package. Also, the emulator math option is the
option selected by default by the compiler if no other floating-point math
option is specified.
Use the emulator math package to maximize accuracy on systems without math
coprocessors or if your program will be run on some systems with
coprocessors and some systems without coprocessors.
The emulator package performs basic operations to the same degree of
accuracy as a math coprocessor. However, the emulator routines used for
transcendental math functions (such as sin, cos, tan) differ slightly from
the corresponding functions performed on a coprocessor. This difference can
cause a slight discrepancy (usually within two bits) between the results of
these operations when performed with the software emulation instead of with
a math coprocessor.
When you use the emulator package, some floating-point exceptions are
masked.
When you use a math coprocessor or the emulator floating-point math package,
interrupt-enable, precision, underflow, and denormalized-operand exceptions
are masked by default. The remaining floating-point exceptions are unmasked.
See the discussion of the _control87 function in on-line help for more
information about 80x87 floating-point exceptions.
4.3.2 Math Coprocessor Package
The math coprocessor package utilizes the 80x87 math coprocessor exclusively
for floating-point calculations. If you use the math coprocessor package,
the machine on which your application is to run must have an 80x87
coprocessor to perform floating-point operations. This package gives you the
fastest, smallest programs possible for handling floating-point math.
4.3.3 Alternate Math Package
The alternate math package gives you the smallest and fastest programs
possible without a coprocessor. However, the program results are not as
accurate as results given by the emulator package.
The alternate math package uses the same format as the IEEE standard-format
numbers with less precision and weaker error checking. The alternate math
package does not support infinities, NANs ("not a number"), and denormal
numbers.
You must always use the alternate math package when developing routines that
are to be placed in an OS/2 dynamic-link library (DLL) using LLIBCDLL.LIB.
Do not, however, use the alternate math package for building the C run-time
DLL using CDLLOBJS.LIB; instead, use the emulator math package. For more
information about creating dynamic-link libraries for OS/2, see Chapter 16.
4.4 Selecting Floating-Point Options (/FP)
You can select a floating-point library and the method of accessing
floatingpoint routines by setting options in PWB or by specifying
command-line options to CL. You can choose between the emulator, alternate,
or math coprocessor library. You can also access the floating-point routines
by issuing a function call (or calls) or by generating in-line 80x87
instructions to execute the floating-point operation. The smallest and the
fastest floating-point math option is the in-line math coprocessor package
because the compiler generates true 80x87 coprocessor instructions. If,
however, you cannot depend on the target computer having a coprocessor, you
must use either the emulator or alternate math options.
To specify floating-point options on the CL command line, you must specify
an option from the list in Table 4.4. You specify these options to CL
starting with the floating-point option string /FP.
Based on the floating-point option and the memory-model option you choose,
the compiler embeds a library name in the object file that it creates. This
library is then considered the default library; that is, the linker searches
in the standard places for a library with that name. If it finds a library
with that name, the linker uses the library to resolve external references
in the object file being linked. Otherwise, it displays a message indicating
that it could not find the library.
This mechanism allows the linker to automatically link object files with the
appropriate library. However, you can link with a different library in some
cases. See Table 4.4 and Section 4.5, "Library Considerations for
Floating-Point Options," for more information about linking with different
libraries.
Table 4.4 summarizes the floating-point options and their effects. These
options are described in detail in the following sections.
Table 4.4 Summary of Floating-Point Options
╓┌──────────────────────┌──────────────┌──────────────┌────────────────────┌─►
Option for CL for PWB Combined Use Lib
of Method Effect Coprocessor Sel
Option for CL for PWB Combined Use Lib
of Method Effect Coprocessor Sel
─────────────────────────────────────────────────────────────────────────────
/FPi In-line Default; Uses coprocessor if mLI
In-Line larger than present(1)
Emulation /FPi87, but
can work
without a
coprocessor;
most
efficient way
to get
maximum
precision
without a
coprocessor
/FPi87 In-line Smallest and Requires mLI
In-Line Math fastest coprocessor
Coprocessor option
available
Option for CL for PWB Combined Use Lib
of Method Effect Coprocessor Sel
─────────────────────────────────────────────────────────────────────────────
available
with a
coprocessor
/FPc Calls Slower than Uses coprocessor if mLI
Calls to /FPi, but present(1)
Emulator allows use of
alternate
math library
at link time
/FPc87 Calls Slower than Requires mLI
Calls to Math /FPi87, but coprocessor unless
Coprocessor allows use of library changed at
alternate link time(5)
math library
at link time
Option for CL for PWB Combined Use Lib
of Method Effect Coprocessor Sel
─────────────────────────────────────────────────────────────────────────────
/FPa Calls Fastest and Ignores mLI
Alternate Math smallest coprocessor
option
available
without a
coprocessor,
but
sacrifices
some accuracy
for speed
─────────────────────────────────────────────────────────────────────────────
(1) Use of the coprocessor can be suppressed by setting NO87.
(2) Can be linked explicitly with mLIBC7.LIB at link time.
(3) Can be linked explicitly with mLIBCA.LIB at link time.
(4) Can be linked explicitly with mLIBCE.LIB at link time.
(5) Use of the coprocessor can be suppressed by setting NO87 if you change
to the emulator library at link time.
Optimizations such as constant propagation and constant subexpression
elimination can cause some expressions to be evaluated at compile time. Such
evaluations always use IEEE format and are unaffected by the floating-point
option you choose. For more information about optimizing, see Chapter 1,
"Optimizing C Programs."
You can specify floatingpoint options in the Programmer's WorkBench.
To specify floating-point options when using the Programmer's WorkBench, you
must modify the C Global Build Options (available on the Options menu). In
the C Global Build Options dialog box, select one of the following
floating-point math options:
Option Effect
────────────────────────────────────────────────────────────────────────────
Emulation Calls Generates calls; makes emulator math
library the
default (/FPc)
80x87 Calls Generates calls; makes math coprocessor
library the default (/FPc87)
Fast Alternate Math Generates calls; makes alternate math
library the
default (/FPa)
Inline Emulation Generates in-line instructions; makes
emulator math library the default
(/FPi); this is the default option
Inline 80x87 Generates in-line instructions; selects
Instructions math coprocessor library (/FPi87)
4.4.1 In-Line Emulator Option (/FPi)
The in-line emulator option (/FPi) generates in-line instructions for an
80x87 coprocessor and places the name of the emulator library (mLIBCE.LIB)
in the object file. At link time, you can specify the math coprocessor
library (mLIBC7.LIB) instead. If you do not choose a floating-point option,
the compiler uses the in-line emulator option by default.
The in-line emulator option is useful if you cannot be sure that an 80x87
coprocessor will be available on the target computer. Programs compiled
using the in-line emulator option work as described below:
■ If a coprocessor is present at run time, the program uses the
coprocessor.
■ If no coprocessor is present, the program uses the emulator. In this
case, the in-line emulator option offers the most efficient way to get
maximum precision in floating-point results.
When you use the in-line emulator option, the compiler does not generate
in-line 80x87 instructions. For real-mode code, the compiler generates
software interrupts to library code, which then fixes up the interrupts to
use either the emulator or the coprocessor, depending on whether a
coprocessor is present. For protected-mode code, the compiler generates no
such interrupts; it generates 80x87 instructions. If the target computer
does not have a coprocessor, an "unsupported extension" exception occurs,
which is vectored to library code. If you want true in-line 80x87
instructions, use the in-line math coprocessor option (/FPi87).
────────────────────────────────────────────────────────────────────────────
NOTE
In an OS/2 dynamic-link library built with LLIBCDLL.LIB, you cannot use code
that requires the emulator library. You must use the alternate math library
instead.
────────────────────────────────────────────────────────────────────────────
4.4.2 In-Line Math Coprocessor Instructions Option (/FPi87)
The in-line math coprocessor instructions option (/FPi87) instructs the
compiler to place 80x87 coprocessor instructions in your code for many math
operations. It also causes the name of a math coprocessor library
(mLIBC7.LIB) to be embedded in the object file.
If you use the in-line math coprocessor instructions option and link with
the library mLIBC7.LIB, an 80x87 coprocessor must be present at run time, or
the program fails and the following error message is displayed:
run-time error R6002
- floating point not loaded
Compiling with the in-line math coprocessor instructions option results in
the smallest, fastest programs possible for handling floating-point results.
4.4.3 Calls to Emulator Option (/FPc)
The calls to emulator option (/FPc) generates floating-point calls to the
emulator library and places the names of an emulator library (mLIBCE.LIB) in
the object file. At link time, you can specify a math coprocessor library
(mLIBC7.LIB) or an alternate math library (mLIBCA.LIB) instead. Thus, the
calls to emulator option gives you more flexibility in the libraries you can
use for linking than the in-line emulator option.
Using the calls to emulator option is also recommended in the following
cases:
■ If you compile modules that perform floating-point operations and plan
to include these modules in a library
■ If you compile modules that you want to link with libraries other than
the libraries provided with Microsoft C
You cannot link with an alternate math library if your program uses the
intrinsic forms of floating-point library routines (that is, if you have
compiled the program with the /Oi or /Ox option, selected the Generate
Intrinsic Functions option from the Debug Build Options or Release Build
Options dialog box in the Programmer's WorkBench, or specified math
functions in an intrinsic pragma).
4.4.4 Calls to Math Coprocessor Option (/FPc87)
The calls to math coprocessor option (/FPc87) generates function calls to
routines in the math coprocessor library (mLIBC7.LIB) that issue the
corresponding 80x87 instructions. As with the in-line math coprocessor
instructions option (/FPi87), at link time you can choose to link with an
emulator library (mLIBCE.LIB). However, /FPc offers more flexibility in
choosing libraries, since you can change your mind and link with the
appropriate alternate math library as well (mLIBCA.LIB).
The disadvantages of using the calls to math coprocessor option as opposed
to the in-line coprocessor option are the following:
■ Your executable size is larger because a call requires more
instructions than a true coprocessor instruction.
■ Your program does not execute as fast because you must issue a
function call for each floating-point operation.
You cannot link with an alternate math library if your program uses the
intrinsic forms of floating-point library routines (that is, if you have
compiled the program with the /Oi or /Ox option, selected the Generate
Intrinsic Functions option from the Debug Build Options or Release Build
Options dialog box in the Programmer's WorkBench, or specified math
functions in an intrinsic pragma).
You must have a math coprocessor installed to run programs compiled with the
/FPc option and linked with a math coprocessor library. Otherwise, the
program fails and the following error message is displayed:
run-time error R6002
- floating point not loaded
────────────────────────────────────────────────────────────────────────────
NOTE
Certain optimizations are not performed when you use the calls to math
coprocessor option. This can reduce the efficiency of your code; also, since
arithmetic of different precision can result, there may be slight
differences in your results.
────────────────────────────────────────────────────────────────────────────
4.4.5 Use Alternate Math Option (/FPa)
The use alternate math option (/FPa) generates floating-point calls and
selects the alternate math library for the appropriate memory model
(mLIBCA.LIB). Calls to this library provide the fastest and smallest option
for code intended to run on a machine without an 80x87 coprocessor. With
this option, you can choose an emulator library (mLIBCE.LIB) or a math
coprocessor library (mLIBC7.LIB) at link time.
You cannot link with an alternate math library if your program uses the
intrinsic forms of floating-point library routines (that is, if you have
compiled the program with the /Oi or /Ox option, selected the Generate
Intrinsic Functions from the Debug Build Options or Release Build Options
dialog box in the Programmer's WorkBench, or specified math functions in an
intrinsic pragma).
4.5 Library Considerations for Floating-Point Options
You may want to use libraries in addition to the default library for the
floating-point option you have chosen in your compile options. For example,
you may want to create your own libraries (or other collections of
subprograms in object-file form), then link these libraries at a later time
with object files that you have compiled using different options.
The following sections describe these cases and ways to handle them.
Although the discussion assumes that you are putting your object files into
libraries, the same considerations apply if you are simply using individual
object files.
4.5.1 Using One Standard Library for Linking
You must use only one standard C run-time library when you link. You can
control which library is used in one of two ways:
1. In the Programmer's WorkBench, add the name of the C run-time library
file you want to the program list using the Edit Program List option
from the Make menu. You must also modify the Linker Options (from the
Make menu) by specifying No Default Library Search.
2. From the LINK command line, give the /NODEFAULTLIBRARYSEARCH (/NOD)
option and then specify the name of the combined library file you want
to use in the link-libinfo field of the CL command line. This
overrides the library names embedded in the object files.
4.5.2 In-Line Instructions or Calls
When deciding on a floating-point option, you should decide whether you want
to use in-line instructions. If you do, compile with the in-line math
coprocessor instructions (/FPi87) or in-line emulator (/FPi) option.
Otherwise, compile for floating-point function calls using the calls to math
coprocessor (/FPc87), calls to emulator (/FPc), or alternate math (/FPa)
option.
If you choose to use in-line instructions for your precompiled object files,
you cannot link with an alternate math library (mLIBCA.LIB). However,
in-line instructions achieve the best performance from your programs on
machines that have an 80x87 coprocessor installed.
If you choose to use calls, your programs are slower, but at link time you
can switch to any standard C run-time library (that is, any library created
by the SETUP program) that supports the memory model you have chosen.
4.6 Compatibility between Floating-Point Options
Each time you compile a source file, you can specify a floating-point
option. When you link two or more source files to produce an executable
program file, you must ensure that floating-point operations are handled
consistently and that the environment is set up properly to allow the linker
to find the required library.
If you are building libraries of C routines that contain floating-point
operations, the calls to emulator option (/FPc) provides the most
flexibility.
The examples that follow illustrate how you can link your program with a
library other than the default. The floating-point option and the substitute
library are compatible.
The example below compiles the program CALC.C with the medium-model
option (/AM). Because no floating-point option is specified, the default
in-line emulator option (/FPi) is used. The in-line emulator option
generates 80x87 instructions and specifies the emulator library MLIBCE.LIB
in the object file. The /LINK field specifies the /NODEFAULTLIBRARYSEARCH
(/NOD) option and the names of the medium-model math coprocessor library.
Specifying the math coprocessor library forces the program to use an 80x87
coprocessor; the program fails if a coprocessor is not present.
CL /AM CALC.C /link MLIBC7 /NOD
The example below compiles CALC.C using the small (default) memory model
and the alternate math option (/FPa). The /LINK field specifies the /NOD
option and the library SLIBCE.LIB. Specifying the emulator library causes
all floating-point calls to refer to the emulator library instead of the
alternate math library.
CL /FPa CALC.C /link SLIBCE /NOD
The example below compiles CALC.C with the calls to math coprocessor
option (/FPc87), which places the library name SLIBC7.LIB in the object
file. The /LINK field overrides this default-library specification by giving
the /NOD option and the name of the small-model alternate math library
(SLIBCA.LIB).
CL /FPc87 CALC.C /link SLIBCA.LIB/NOD
4.7 Using the NO87 Environment Variable
Programs compiled using either the calls to emulator (/FPc) or the in-line
emulator (/FPi) option automatically use an 80x87 coprocessor at run time if
one is installed. You can override this and force the use of the software
emulator by setting an environment variable named NO87.
Use the NO87 environment variable to suppress use of the 80x87 coprocessor
at run time.
If NO87 is set to any value when the program is executed, use of the
coprocessor is suppressed. The value of the NO87 setting is printed on the
standard output as a message. The message is printed only if a coprocessor
is present and suppressed; if no coprocessor is present, no message appears.
If you don't want a message to be printed, set NO87 equal to one or more
spaces. A blank string for NO87 causes a blank line to be printed.
Note that only the presence or absence of the NO87 definition is important
in suppressing use of the coprocessor. The actual value of the NO87 setting
is used only for printing the message.
The NO87 variable takes effect with any program linked with an emulator
library (mLIBCE.LIB). It has no effect on programs linked with math
coprocessor libraries (mLIBC7.LIB) or programs linked with alternate math
libraries (mLIBCA.LIB).
When a program that uses an emulator library is executed and an 80x87
coprocessor is present, the example below causes the message Use of
coprocessor suppressed to appear.
SET NO87=Use of coprocessor suppressed
The syntax below sets the NO87 variable to the space character. Use of the
coprocessor is still suppressed, but no message is displayed.
SET NO87=space
4.8 Incompatibility Issues
The exception handler in the libraries for 80x87 floating-point calculations
(mLIBCE.LIB and mLIBC7.LIB) is designed to work without modification on the
IBM PC family of computers and on closely compatible computers, including
the WANG(R) PC, the AT&T(R) 6300, and the Olivetti(R) personal computers.
Also, the libraries need not be modified for the Texas Instruments(R)
Professional Computer, even though it is not compatible. Any machine that
uses nonmaskable interrupts (NMI) for 80x87 exceptions will run with the
unmodified libraries. If your computer is not one of these, and if you are
not sure whether it is completely compatible, you may need to modify the
math coprocessor libraries.
All Microsoft languages that support 80x87 coprocessors intercept 80x87
exceptions in order to produce accurate results and properly detect error
conditions. To make the libraries work correctly on incompatible machines,
you can modify the libraries. To make this easier, an assembly-language
source file, EMOEM.ASM, is included on the C 6.0 distribution disk. Any
machine that sends the 80x87 exception to an 8259 Priority Interrupt
Controller (master or master/slave) can be supported by a simple table
change to the EMOEM.ASM module. The source file contains further
instructions about how to modify EMOEM.ASM, patch libraries, and executable
files.
PART II Improving Programmer Productivity
────────────────────────────────────────────────────────────────────────────
The Microsoft C Professional Development System helps you write and debug
software rapidly.
Chapter 5 describes the quick compile and incremental compile options, both
of which can save you time when compiling programs. Chapter 5 also describes
the incremental linker, ILINK, which can save you time when you link your
application. Chapter 6 describes NMAKE, a powerful new program maintenance
utility that automates your program build process. Chapter 7 describes how
to build help files with HELPMAKE, the help-file maintenance utility. When
you need to share documentation in a readily accessible form, you can add it
to the Microsoft Advisor on-line help system using the information in
Chapter 7. Chapter 8 explains how to customize the Programmer's WorkBench to
make it a personalized development platform. Chapter 9 offers procedures
(and some tips) for using the CodeView debugger to find errors in your
programs.
Chapter 5 Compiling and Linking Quickly
────────────────────────────────────────────────────────────────────────────
The fundamental processes of compiling and linking take time to perform. The
larger your application grows, the longer it takes to compile and link.
This chapter describes how you can speed up compiling by using the quick
compiler and incremental compile option, and how you can speed up linking by
using ILINK, the Incremental Linker.
5.1 Compiling Quickly
This section describes two ways to speed up the compiling process: using the
quick compiler and using the incremental compile option.
5.1.1 Quick Compiler
The Microsoft C Professional Development System includes two separate C
compilers: the full compiler and the quick compiler. If you don't specify
otherwise, your program is compiled by the full compiler.
You access the quick compiler by specifying the /qc command-line option for
CL or by selecting the Quick Compile option from the C Release Build or C
Debug Build Options dialogs in the PWB Options menu.
The quick compiler cannot perform as many optimizations as the full
compiler, but it is much faster. You can use it to save time during
development, whenever optimizations are not critical. When your application
is finished, you can compile with the full compiler, using all the desired
optimizations.
On-line help for the /qc option describes which optimizations the quick
compiler can perform.
5.1.2 Incremental Compile Option
You can speed up compiling even more by compiling incrementally. Incremental
compilation means that the compiler compiles only those functions that have
changed since you last compiled.
The incremental compile option is available only with the quick compiler
(see the previous section). You can access it from within PWB or from the
DOS command line. Within PWB, select the Incremental Compile option in the C
Release Build dialog box or in the C Debug Build Options dialog box. From
the DOS command line, specify the /Gi option for CL.
The incremental compile option automatically triggers another time-saving
feature: the Incremental Linker, which is described in the next section.
5.2 Linking Quickly with ILINK
ILINK links only those modules that have changed since the last link.
The Incremental Linker (ILINK) offers the same advantage in linking that the
incremental compile option offers in compiling. Rather than link every
module in an application, as LINK does, ILINK links only those modules that
have changed since the last link. The more modules your application
contains, the more time ILINK can potentially save.
In a normal development scenario, you use LINK at the beginning and end of
the process, and use ILINK in the middle. In the early stages of
development, when your application contains only a few modules, ILINK offers
no speed advantage over LINK. Once your application contains several
modules, you can save time by using ILINK.
You must link once with LINK to prepare for incremental linking.
To prepare for incremental linking, you must run LINK using /INCREMENTAL, as
described in Section 5.2.1. At the same time, you have the option of adding
padding bytes to code or data segments by specifying the /PADCODE and
/PADDATA options. Padding allows ILINK to expand a segment without relinking
the entire module in which it is contained.
Now you can link with ILINK during the rest of development. If changes in
your code require a full link, ILINK invokes LINK automatically. When the
application is finished, you link a last time with LINK to produce the final
executable file.
You can use ILINK with programs compiled for any memory model except tiny
model. (Memory models are described in Chapter 2, "Managing Memory.")
Typically, ILINK is not efficient for small- or compact-model programs
unless they were compiled with the incremental compile option, which is
described in Section 5.1.2.
5.2.1 Preparing for Incremental Linking
There are three LINK options that relate to the use of ILINK. One of them
(/INCREMENTAL) is mandatory; the other two (/PADCODE and /PADDATA) are
optional. This section explains the LINK options that prepare for ILINK. See
on-line help for a complete list of LINK options.
The /INCREMENTAL Option
The /INCREMENTAL (/INC) option prepares an object file for incremental
linking. You must always run LINK using this option before using ILINK. When
you specify /INC, the linker produces two extra files: a symbol file (.SYM)
and an ILINK support file (.ILK). The .SYM and .ILK files tell ILINK which
parts of the executable file need to be updated.
You must use /INCREMENTAL whenever you use the /PADCODE and /PADDATA
options, which are described below.
The /PADCODE Option
The /PADCODE option causes LINK to add padding bytes at the end of a
module's code segment. The padding bytes leave room for the code segment to
grow in subsequent links, allowing ILINK to update only that module. You can
use the /PADCODE option only when /INC is also specified.
Code padding is usually necessary for programs using the small memory model.
It is also recommended for compact- or mixed-model programs. You do not need
to specify /PADCODE for other memory models (medium, large, or huge).
If you don't specify /PADCODE, LINK doesn't pad the code segment at all. To
add padding, specify the desired number of bytes. The optimum amount of
padding depends on how much your code changes from one link to the next. If
you expect to add only a little code, choose a relatively small amount of
padding, say 32 to 64 bytes. If ILINK issues the message
padding exceeded
and performs a full link more often than desired, increase the padding by a
small amount, say 32 bytes. In any case, remember that the total size of a
code segment, including padding bytes, cannot exceed 64K (65,535) bytes.
The /PADDATA Option
Like /PADCODE, the /PADDATA option causes LINK to add padding bytes that
leave room for the segment to grow in subsequent links. However, the
/PADDATA option pads the end of the data segment rather than the code
segment. You can use /PADDATA only when /INC is also specified.
If you don't specify /PADDATA, LINK adds 16 bytes of padding by default. The
default padding amount should suffice in many cases, since public variables
are added less frequently than code. If you need more padding, specify the
desired number of bytes. Remember that the total size of a data segment,
including padding bytes, cannot exceed 64K (65,535) bytes.
5.2.2 Incremental Violations
ILINK can generate two kinds of errors: real errors and incremental
violations. Real errors are errors such as undefined symbols that cannot be
resolved by a full link. If ILINK detects a real error, it displays an error
message (real errors are documented in on-line help).
Incremental violations are caused by code changes you have made that go
beyond the scope of incremental linking. When an incremental violation
occurs, ILINK invokes LINK automatically. The following sections describe
the incremental violations.
Changing Libraries
An incremental violation occurs when a library changes. Furthermore, if an
altered module shares a code segment with a library, ILINK needs access to
the library as well as to the altered module.
If you add a function, procedure, or subroutine call to a library that has
never been called before, ILINK invokes LINK automatically.
Exceeding Code/Data Padding
An incremental violation occurs if two or more modules contribute to the
same physical segment and either module exceeds its padding. The padding
allows the module to increase the specified number of bytes before another
full link is required.
Moving or Deleting Data Symbols
An incremental violation occurs if a data symbol is moved or deleted. To add
new data symbols without requiring a full link, add the new symbols at the
end of all other data symbols in the module.
Deleting Code Symbols
You can move or add code symbols, but an incremental violation occurs if you
delete any code symbols from a module. Code symbols can be moved within a
module but cannot be moved between modules.
Changing Segment Definitions
An incremental violation results if you add, delete, or change the order of
segment definitions.
Adding CodeView(R) Debugger Information
If you include CodeView debugger information for a module when you fully
link (by compiling and linking with CodeView debugger support), ILINK
supports CodeView debugger information for the module. ILINK maintains
symbolic information for current symbols, and it adds information for any
new symbols. However, if you try to add CodeView debugger information for a
module that did not previously have CodeView debugger support, an
incremental violation occurs. See Chapter 9, "Debugging C Programs with
CodeView," for more information about CodeView.
Chapter 6 Managing Development Projects with NMAKE
────────────────────────────────────────────────────────────────────────────
The Microsoft Program-Maintenance Utility (NMAKE) is a sophisticated command
processor that can save time and simplify project management. By determining
which project files depend on others, NMAKE can automatically execute the
commands needed to update your project when any project file has changed.
The advantage of using NMAKE over simple batch files is that NMAKE does only
what is needed. You don't waste time rebuilding files that are already
up-to-date. NMAKE also has advanced features, such as macros, that help you
manage complex projects.
This chapter provides complete documentation for NMAKE. Information about
NMAKE is also available in on-line help. If you are familiar with MAKE, the
predecessor of NMAKE, be sure to read Section 6.9, "Differences Between
NMAKE and MAKE." There are some important differences between the two
utilities.
6.1 Overview of NMAKE
NMAKE works by comparing the times and dates of two sets of files, which are
called "targets" and "dependents." A target is normally a file that you want
to create, such as an executable file. A dependent is a file used to create
a target, such as a C source file.
When you run NMAKE, it reads a "description file" that you supply. The
description file consists of one or more blocks. Each block typically lists
a target, the target's dependents, and the command that builds the target.
NMAKE compares the date and time of the target to those of its dependents.
If any dependent has changed more recently than the target, NMAKE updates
the target by executing the command listed in the block.
NMAKE's main purpose is to help you update applications quickly and simply.
However, it can execute any command, so it is not limited to compiling and
linking. NMAKE can also make backups, move files, and do many other project
management tasks.
6.2 The NMAKE Command
When you run NMAKE, you can supply the description-file name and other
arguments using the following syntax:
NMAKE «options» «macros» «targets» «descriptfile»
All of the command-line fields are optional. If you don't supply any
arguments, NMAKE looks for a default description file named MAKEFILE and
follows various other defaults that are described in this chapter.
The options field lists NMAKE options, which are described in Section 6.4,
"Command-Line Options."
The macros field lists macro definitions, which allow you to replace text in
the description file. Macros are described in Section 6.3.3.
The targets field lists targets to build. If you do not list any targets,
NMAKE builds only the first target in the description file. (This is a
significant departure from the behavior of MAKE, NMAKE's predecessor. See
Section 6.9, "Differences between NMAKE and MAKE.")
The descriptfile field specifies a description file. If this field is
absent, NMAKE automatically looks for a file named MAKEFILE in the current
directory. You can also specify the description file with the /F option (for
information, see Section 6.4, "Command-Line Options").
Below is a typical NMAKE command:
NMAKE /S "program = sample" sort.exe search.exe
The command supplies four arguments: an option (/S), a macro definition
("program = sample"), and two target specifications (sort.exe search.exe).
Because the command does not specify a description file, NMAKE looks for the
default description file, MAKEFILE. The /S option tells NMAKE to suppress
the display of commands as they are executed. The macro definition performs
a text substitution throughout the description file, replacing every
instance of program with sample. The target specifications tell NMAKE to
update the targets SORT.EXE and SEARCH.EXE.
6.3 NMAKE Description Files
You must always supply NMAKE with a description file. In addition to
description blocks, which tell NMAKE how to build your project's target
files, the description file can contain comments, macros, inference rules,
and directives. This section describes all the elements of description
files.
6.3.1 Description Blocks
Description blocks form the heart of the description file. Figure 6.1
illustrates a typical NMAKE description block, including the three parts:
targets, dependents, and commands.
(This figure may be found in the printed book.)
A target is a file that you want to build.
The targets part of the description block lists one or more files to build.
The line that lists targets and dependents is called the "dependency line."
The example in Figure 6.1 tells NMAKE to build a single target, MYAPP.EXE.
Although single targets are common, you can also list multiple targets;
separate each target name with a space. If the rightmost target name is one
character long, put a space between the name and the colon.
The target is normally a file, but it can also be a "pseudotarget," a name
that allows you to build groups of files or execute a group of commands. See
Section 6.3.6, "Pseudotargets."
A dependent is a file used to build a target.
The dependents part of the description block lists one or more files from
which the target is built. It is separated from the targets part by a colon.
The example in Figure 6.1 lists three dependents:
myapp.exe : myapp.obj another.obj myapp.def
The example tells NMAKE to build the target MYAPP.EXE whenever MYAPP.OBJ,
ANOTHER.OBJ, or MYAPP.DEF has changed more recently than MYAPP.EXE.
If any dependents of a target are listed as targets in other description
blocks, then NMAKE builds those files before it builds the original target.
Essentially NMAKE evaluates a "dependency tree" for the entire description
file. It builds files in the order needed to update the original target,
never building a target until all files that depend on it are up-to-date.
The dependent list can also include a list of directories in which NMAKE
should search for dependents. The directory list is enclosed in curly braces
( {} ) and precedes the dependent list. NMAKE searches the current directory
first, then the directories you list:
forward.exe : {\src\alpha;d:\proj}pass.obj
In the line above, the target, FORWARD.EXE, has one dependent: PASS.OBJ. The
directory list specifies two directories:
{\src\alpha;d:\proj}
NMAKE begins searching for PASS.OBJ in the current directory. If it is not
found, NMAKE searches the \ SRC \ ALPHA directory, then the D:\ PROJ
directory. If NMAKE cannot find a dependent in the current directory or a
listed directory, it looks for an inference rule that describes how to
create the dependent (see Section 6.3.4, "Inference Rules").
The commands part of a description block can contain one or more
commands.
The commands part of the description block lists the command(s) NMAKE should
use to build the target. This can be any command that you can execute from
the command line. The example tells NMAKE to build MYAPP.EXE using the
following LINK command:
LINK myapp another.obj, /align:16, NUL, os2, myapp
Notice that the line above is indented. NMAKE uses indentation to
distinguish between the dependency line and command line. If the command
appears on a separate line, as here, it must be indented at least one space
or tab. The dependency line must not be indented (it cannot start with a
space or tab).
Many targets are built with a single command, but you can place more than
one command after the dependency line. A long command can span several lines
if each line ends with a backslash ( \ ).
You can also place the command at the end of the dependency line. Separate
the command from the rightmost dependent with a semicolon.
In OS/2 description files, NMAKE imposes a slight restriction on the use of
the CD, CHDIR, and SET commands. Do not place any of these commands on a
command line that uses the ampersand (&) to execute multiple commands. For
instance, the following command line is legal in an OS/2 description file,
DIR & COPY sample.c backup.c
but this line is not legal because it places a CD command after the
ampersand:
DIR & CD \mydir
To use CD, CHDIR, or SET in a description block, place the command on a
separate line:
DIR
CD \mydir
Your OS/2 user's documentation contains more information about using the
ampersand in command lines.
Wild Cards
You can use DOS wild-card characters (* and ?) to specify target and
dependent file names. NMAKE expands wild cards in target names when it reads
the description file. It expands wild cards in the dependent names when it
builds the target. For example, the following description block compiles all
source files with the .C extension:
bondo.exe : *.c
CL *.c
Command Modifiers
Command modifiers provide extra control over the command listed in a
description block. They are special characters that appear in front of a
command. You can use more than one modifier for a single command. Table 6.1
describes the three NMAKE command modifiers.
Table 6.1 Command Modifiers
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Character Action
────────────────────────────────────────────────────────────────────────────
At sign (@) Prevents NMAKE from displaying the
command as it executes. In the example
below, NMAKE does not display the ECHO
command line:
sort.exe : sort.obj
@ECHO sorting
The output of the ECHO command appears
as usual.
Dash (-) Turns off error checking for the command.
Character Action
────────────────────────────────────────────────────────────────────────────
Dash (-) Turns off error checking for the command.
If the dash is followed by a number,
NMAKE stops only if the error level
returned by the command is greater than
the number. In the following example, if
the program sample returned an error
code NMAKE does not stop but continues
to execute commands:
light.lst : light.txt
-sample light.txt
Exclamation point (!) Executes the command for each dependent
file if the command uses the predefined
macros $? or $**. The $? macro refers to
all dependent files that are out-of-date
with respect to the target. The $**
macro refers to all dependent files in
the description block (see Section 6.3.3,
Character Action
────────────────────────────────────────────────────────────────────────────
the description block (see Section 6.3.3,
"Macros"). For example,
print:hop.asm skip.bas jump.c
!print $** lpt1:
generates the following commands:
print hop.asm lpt1:
print skip.bas lpt1:
print jump.c lpt1:
────────────────────────────────────────────────────────────────────────────
Using Control Characters as Literals
Occasionally, you may need to list a file name that contains a character
that NMAKE uses as a control character. These characters are
# ( ) $ ^ \ { } ! @ -
To use an NMAKE control character as a literal character, place a caret (^)
in front of it. For example, say that you define a macro that ends with a
backslash:
exepath=c:\bin\
The line above is intended to define a macro named exepath with the value
c:\bin\. But the second backslash causes unexpected results. Since the
back-slash is the NMAKE line-continuation character, the line actually
defines the macro exepath as c:\bin followed by whatever appears on the
next line of the description file. You can solve the problem by placing a
caret in front of the second backslash:
exepath=c:\bin^\
You can also use a caret to place a literal newline character in a
description file. This feature can be useful in macro definitions:
XYZ=abc^
def
NMAKE interprets the example as if you assigned the C-style string abc\ndef
to the XYZ macro. This effect differs from using the backslash ( \s ) to
continue a line. A newline character that follows a backslash is replaced
with a space.
Carets that precede noncontrol characters are ignored. The line
ign^ore : these ca^rets
is interpreted as
ignore : these carets
A caret that appears in quotation marks is treated as a literal caret
character.
Listing a Target in Multiple Description Blocks
You can specify more than one description block for the same target by
placing two colons (::) after the target. This feature can be useful for
building a complex target, such as a library, that contains components
created with different commands. For example,
target.lib :: a.asm b.asm c.asm
CL a.asm b.asm c.asm
LIB target -+a.obj -+b.obj -+c.obj;
target.lib :: d.c e.c
CL /c d.c e.c
LIB target -+d.obj -+e.obj;
Both description blocks update the library named TARGET.LIB. If any of the
assembly-language files have changed more recently than the library, NMAKE
executes the commands in the first block to assemble the source files and
update
the library. Similarly, if any of the C-language files have changed, NMAKE
executes the second group of commands, which compile the C files and update
the library.
If you use a single colon in the example above, NMAKE issues an error
message. It is legal, however, to use single colons if commands are listed
in only one block. In this case, dependency lines are cumulative. For
example,
target: jump.bas
target: up.c
echo Building target...
is equivalent to
target: jump.bas up.c
echo Building target...
6.3.2 Comments
You can place comments in a description file by preceding them with a number
sign (#):
# This comment appears on its own line
huey.exe : huey.obj dewey.obj # Comment on the same line
link huey.obj dewey.obj;
A comment extends to the end of the line in which it appears. Command lines
cannot contain comments.
6.3.3 Macros
Macros allow you to do text replacements throughout the description file.
Macros offer a convenient way to replace a string in the description file
with another string. The text is automatically replaced each time you run
NMAKE. Macros are useful in a variety of tasks, including the following:
■ To create a standard description file for several projects. The macro
represents the file names used in commands. These file names are then
defined when you run NMAKE. When you switch to a different project,
you can change file names throughout the description file by changing
a single macro.
■ To control the options that NMAKE passes to the compiler or linker.
When you specify options in a macro, you can change options throughout
the description file in one easy step.
You can define your own macros or use predefined macros. This section begins
by describing user-defined macros.
User-Defined Macros
You can define a macro with
macroname = string
The macroname can be any combination of letters, digits, and the underscore
( _ ) character. Macro names are case sensitive. NMAKE interprets MyMacro
and MYMACRO as different macro names.
The string can be any string, including a null string. For example,
command = LINK
defines a macro named command and assigns it the string LINK.
You can define macros in the description file or on the command line. In the
description file, you must define each macro on a separate line; the line
cannot start with a space or tab. The string can contain embedded spaces,
and NMAKE ignores spaces on either side of the equal sign. You do not need
to enclose string in quotation marks (if you do, they become part of the
string).
Slightly different rules apply when you define a macro on the command line,
because of the way that the command line handles spaces. You must enclose
string in quotation marks if it contains embedded spaces. No spaces can
surround the equal sign. You can also enclose the entire macro definition,
macroname and string, in quotation marks. For example,
NMAKE "program=sample"
defines the macro program, assigning it the value sample.
Once you have defined a macro, you can "undefine" it with the !UNDEF
directive (see Section 6.3.5, "Directives").
Invoking Macros
You invoke a macro by enclosing its name in parentheses preceded by a dollar
sign ($). (The parentheses are optional if macroname is one character long.)
For example, you can invoke the command macro defined above as
$(command)
When NMAKE runs, it replaces every occurrence of $(command) with LINK.
The following description file defines and uses three macros:
program = sample
c = LINK
options =
$(program).exe : $(program).obj
$c $(options) $(program).obj;
NMAKE interprets the description block as
sample.exe : sample.obj
LINK sample.obj;
NMAKE replaces every occurrence of $(program) with sample, every instance
of $c with LINK, and every instance of $(options) with a null string.
Because c is only one character long, you do not need to enclose it in
parentheses.
If you invoke a macro that is not defined, NMAKE treats the macro as a null
string.
Occasionally, you may need to use the dollar sign ($) as a literal
character. Use two signs ($$), or precede it with a caret (^$).
Predefined Macros
NMAKE provides several predefined macros, which represent various file names
and commands. Predefined macros are useful in their own right, and they are
also employed in predefined inference rules, which are described later in
this chapter. Table 6.2 lists NMAKE predefined macros.
Table 6.2 Predefined Macros
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Macro Meaning
────────────────────────────────────────────────────────────────────────────
$@ The current target's full name.
$* The current target's base name (full
name minus the file
extension).
$** The dependents of the current target.
Macro Meaning
────────────────────────────────────────────────────────────────────────────
$? The dependents that are out-of-date with
respect to the current target.
$$@ The target that NMAKE is currently
evaluating. You can only use this macro
to specify a dependent.
$< The dependent file that is out-of-date
with respect to the current target
(evaluated only for inference rules).
$(CC) The command to invoke the C compiler. By
default, $(CC) is predefined as CC = cl,
which invokes the optimizing compiler.
$(AS) The command that invokes the Microsoft
Macro Assembler. NMAKE predefines this
macro as AS = masm.
Macro Meaning
────────────────────────────────────────────────────────────────────────────
macro as AS = masm.
Table 6.2 (continued)
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Macro Meaning
────────────────────────────────────────────────────────────────────────────
$(MAKE) The name with which the NMAKE utility is
invoked. This macro is used to invoke
NMAKE recursively. It causes the line on
which it appears to be executed even if
the /N option is on. You can redefine
this macro if you want to execute
another program.
The $(MAKE) macro is useful for building
different versions of a program. The
following description file invokes NMAKE
Macro Meaning
────────────────────────────────────────────────────────────────────────────
following description file invokes NMAKE
recursively to build targets in the
VERS1 and VERS2 directories.
all :vers1 vers2
versl :
cd versl
$(MAKE)
cd . .
vers2 :
cd vers2
$(MAKE)
cd . .
The example changes to the VERS1
directory, then invokes NMAKE
recursively, causing NMAKE to process
the file MAKEFILE in that directory.
Then it changes to the VERS2 directory
Macro Meaning
────────────────────────────────────────────────────────────────────────────
Then it changes to the VERS2 directory
and invokes NMAKE again, processing the
file MAKEFILE in that directory.
Deeply recursive build procedures can
exhaust NMAKE's run-time stack, causing
a run-time error. To eliminate the error,
use the EXEHDR utility to increase
NMAKE's run-time stack. The following
command, for example, gives NMAKE.EXE a
stack size of 16,384 (0x4000) bytes:
exehdr /stack:0x4000 nmake.exe
$(MAKEFLAGS) The NMAKE options currently in effect.
If you invoke NMAKE recursively, you
should use the command: $(MAKE)
$(MAKEFLAGS). You cannot redefine this
macro.
Macro Meaning
────────────────────────────────────────────────────────────────────────────
$(MAKEDIR) The directory from which NMAKE is
invoked.
────────────────────────────────────────────────────────────────────────────
Like user-defined macro names, predefined macro names are case sensitive.
NMAKE interprets CC and cc as different macro names.
Macro modifiers allow you to specify parts of predefined macros
representing file names.
You can append characters to any of the first six macros in Table 6.2 to
modify its meaning. Appending a D specifies the directory part of the file
name only, an F specifies the file name, a B specifies just the base name,
and an R specifies the complete file name without the extension. If you add
one of these characters, you must enclose the macro name in parentheses.
(The predefined macros $$@ and $** are the only exceptions to the rule that
macro names more than one character long must be enclosed in parentheses.)
For example, assume that $@ has the value C:\ SOURCE \ PROG \ SORT.OBJ. The
list below shows the effect of combining the special characters with $@:
Macro Value
────────────────────────────────────────────────────────────────────────────
$(@D) C:\ SOURCE \ PROG
$(@F) SORT.OBJ
$(@B) SORT
$(@R) C:\ SOURCE \ PROG \ SORT
For example, in the code below, the macro $? represents the names of all
dependents that are more recent than the target. The exclamation point
causes NMAKE to execute the LIB command once for each dependent in the list.
As a result, the LIB command is executed up to three times, each time
replacing a module with a newer version.
trig.lib : sin.obj cos.obj arctan.obj
!LIB trig.lib -+$?;
In the following example, NMAKE updates a group of include files:
# Include files depend on versions in current directory
DIR=c:\include
$(DIR)\globals.h : globals.h
COPY globals.h $@
$(DIR)\types.h : types.h
COPY types.h $@
$(DIR)\macros.h : macros.h
COPY macros.h $@
Each of the files GLOBALS.H, TYPES.H, and MACROS.H in the directory C:\
INCLUDE depends on its counterpart in the current directory. If one of the
include files is out-of-date, NMAKE replaces it with the file of the same
name from the current directory.
Substitution within Macros
Just as macros allow you to substitute text in a description file, you can
also substitute text within a macro itself. Use the following form:
$(macroname:string1 = string2)
You can replace text in a macro, as well as in the description file.
Every occurrence of string1 is replaced by string2 in the macro macroname.
Do not put any spaces or tabs between macroname and the colon. Spaces
between the colon and string1 are made part of string1. If string2 is a null
string, all occurrences of string1 are deleted from the macroname macro.
The following description file illustrates macro substitution:
SRCS = prog.c sub1.c sub2.c
prog.exe : $(SRCS:.c=.obj)
LINK $**;
DUP : $(SRCS)
!COPY $** c:\backup
The predefined macro $** stands for the names of all the dependent files
(see the previous section). If you invoke the example file with a command
line that specifies both targets, NMAKE executes the following commands:
LINK prog.obj sub1.obj sub2.obj;
COPY prog.c c:\backup
COPY sub1.c c:\backup
COPY sub2.c c:\backup
The macro substitution does not alter the definition of the SRCS macro,
rather, it simply replaces the listed characters. When NMAKE builds the
target PROG.EXE, it gets the definition for the predefined macro $** (the
dependent list) from the dependency line, which specifies the macro
substitution in SRCS. The same is true for the second target, DUP. In this
case, however, no macro substitution is requested, so SRCS retains its
original value, and $** represents the names of the C source files. (In the
example above, the target DUP is a pseudotarget; Section 6.3.6 describes
pseudotargets.)
You can also perform substitution in the following predefined macros: $@,
$*, $**, $?, and $. The principle is the same as for other macros. The
command in the following description block substitutes within a predefined
macro:
target.abc : depend.xyz
echo $(@:targ=blank)
If dependent depend.xyz is out-of-date relative to target target.abc,
then NMAKE executes the command
echo blanket.abc
The example uses the predefined macro $@, which equals the full name of the
current target ( target.abc). It substitutes blank for targ in the
target, resulting in blanket.abc. Note that you do not put the usual dollar
sign in front of the predefined macro. The example uses
$(@:targ=blank)
instead of
$($@:targ=blank)
to substitute within the predefined macro $@.
Inherited Macros
When NMAKE executes, it creates macros equivalent to every current
environment variable. These are called "inherited" macros because they have
the same names and values as the corresponding environment variables. (The
inherited macro is all uppercase, however, even if the corresponding
environment variable is not.)
Inherited macros can be used like other macros. You can also redefine them.
The following example redefines the inherited macro PATH:
PATH = c:\tools\bin
sample.obj : sample.c
CL /c sample.c
Inherited macros take their definitions from environment variables.
No matter what value PATH had in the DOS environment, it has the value
c:\tools\bin when NMAKE executes the CL command in this description block.
Redefining the inherited macro does not affect the original environment
variable; when NMAKE terminates, PATH has its original value.
The /E option defeats macro inheritance. If you supply this option, NMAKE
ignores any attempt to redefine a macro that derives from an environment
variable.
Precedence among Macro Definitions
If you define the same macro in more than one place, NMAKE uses the macro
with the highest precedence. The precedence from highest to lowest is as
follows:
1. Macros defined on the command line
2. Macros defined in a description file or include file
3. Inherited macros
4. Macros defined in the TOOLS.INI file
5. Predefined macros such as CC and AS
The /E option defeats any attempt to redefine inherited macros. If you run
NMAKE with this option, macros inherited from environment variables override
any same-named macros in the description file.
6.3.4 Inference Rules
Inference rules are templates that NMAKE uses to create files with a given
extension. For instance, when NMAKE encounters a description block with no
commands, it tries to apply an inference rule that tells how to create the
target from the dependent files, given the two extensions. Similarly, if a
dependent file does not exist, NMAKE tries to apply an inference rule that
tells how to create the missing dependent from another file with the same
base name.
Inference rules tell NMAKE how to create files with a certain extension.
Inference rules provide a convenient shorthand for common operations. For
instance, you can use an inference rule to avoid repeating the same command
in several description blocks.
You can define your own inference rules or use predefined inference rules.
This section begins by describing user-defined inference rules.
User-Defined Inference Rules
You can define inference rules in the description file or in the TOOLS.INI
file. An inference-rule definition lists two file extensions and one or more
commands. For instance, the following inference rule tells NMAKE how to
build a .OBJ file using a .C file:
.C.OBJ:
CL /c $<;
The first line lists two extensions. The second extension (.OBJ) specifies
the type of the desired file and the first (.C) specifies the type of the
desired file's dependent. The second line lists the command used to build
the desired file. Here, the predefined macro $ represents the name of a
dependent that is out-of-date relative to the target.
NMAKE could apply the above inference rule to the following description
block:
sample.obj :
The description block lists only a target, SAMPLE.OBJ. Both the dependent
and the command are missing. However, given the target's base name and
extension, plus the above inference rule, NMAKE has enough information to
build the target. NMAKE first looks for a .C file with the same base name as
the target. If SAMPLE.C exists, NMAKE compares its date to that of
SAMPLE.OBJ (the comparison is triggered by the predefined macro $). If
SAMPLE.C has changed more recently, NMAKE compiles it using the CL command
listed in the inference rule:
CL/c sample.c
────────────────────────────────────────────────────────────────────────────
NOTE
NMAKE applies an inference rule only if the base name of the file it is
trying to create matches the base name of a file that already exists. Thus,
inference rules are useful only when there is a one-to-one correspondence
between the desired file and its dependent. You cannot define an inference
rule that replaces several modules in a library, for example.
────────────────────────────────────────────────────────────────────────────
Extension Search Paths
If an inference rule does not specify a search path, as in the example
above, NMAKE looks for files in the current directory. You can specify a
single path for each of the extensions, using the following form:
{frompath}. fromext{topath}. toext:
commands
NMAKE searches in the frompath directory for files with the fromext
extension. It uses commands to create files with the toext extension in the
topath directory.
Predefined Inference Rules
NMAKE provides predefined inference rules to perform these common
development tasks:
■ Creating an .OBJ file by compiling a .C file
■ Creating an .OBJ file by assembling an .ASM file
■ Creating an .EXE file by compiling a .C file and linking the resulting
.OBJ file
Table 6.3 describes the predefined inference rules.
Table 6.3 Predefined Inference Rules
╓┌───────────────┌─────────────────────────┌─────────────────────────────────╖
Inference Rule Command Default Action
────────────────────────────────────────────────────────────────────────────
.c.obj $(CC) $(CFLAGS) /c $*.c cl /c $*.c
.asm.obj $(AS) $(AFLAGS) $*; masm $*;
.c.exe $(CC) $(CFLAGS) $*.c cl $*.c
────────────────────────────────────────────────────────────────────────────
For example, say that you have the following description file:
sample.exe :
Like the previous example, this description block lists a target without any
dependents or commands. NMAKE looks at the target's extension (.EXE) and
checks for an inference rule that describes how to create a .EXE file. The
last rule in Table 6.3 provides this information:
.c.exe:
$(CC) $(CFLAGS) $*.c
To apply this rule, NMAKE first looks for a file with the same base name as
the target (SAMPLE) and the .C extension. If SAMPLE.C exists in the current
directory, NMAKE executes the CL command given in the rule. The command
compiles SAMPLE.C and links the resulting file SAMPLE.OBJ to create
SAMPLE.EXE.
Precedence among Inference Rules
If the same inference rule is defined in more than one place, NMAKE uses the
rule with the highest precedence. The precedence from highest to lowest is
1. Inference rules defined in the description file
2. Inference rules defined in the TOOLS.INI file
3. Predefined inference rules
NMAKE uses a predefined inference rule only if no user-defined inference
rule exists for the desired operation.
6.3.5 Directives
Directives allow you to write description files that are similar to batch
files. Directives can execute commands conditionally, display error
messages, include other files, and turn on or off certain options.
NMAKE directives are similar to C preprocessor directives.
A directive begins with an exclamation point (!), which must appear at the
beginning of the line. You can place spaces between the exclamation point
and the directive keyword. (See Table 6.4.)
Table 6.4 Directives
╓┌────────────────────────┌──────────────────────────────────────────────────╖
Directive Description
────────────────────────────────────────────────────────────────────────────
!CMDSWITCHES Turns on or off one of four NMAKE options: /D, /I,
{+| -}opt...
/N, and /S. If no options are specified, the
options are reset to the way they were when NMAKE
started. Turn an option on by preceding it with a
plus sign (+), or turn it off by preceding it
with a minus sign (-). Using this keyword updates
the MAKEFLAGS macro.
!ELSE Executes the statements between the !ELSE and
!ENDIF keywords if the statements preceding the
!ELSE keyword were not executed.
!ENDIF Marks the end of the !IF, !IFDEF, or !IFNDEF
block of statements.
Directive Description
────────────────────────────────────────────────────────────────────────────
!ERROR text Causes text to be printed and then stops
execution.
!IF constantexpression Executes the statements between the !IF keyword
and the next !ELSE or !ENDIF keyword if constant
expression evaluates to a nonzero value.
!IFDEF macroname Executes the statements between the !IFDEF
keyword and the next !ELSE or !ENDIF keyword if
macroname is defined. NMAKE considers a macro
with a null value to be defined.
!IFNDEF macroname Executes the statements between the !IFNDEF
keyword and the next !ELSE or !ENDIF keyword if
macroname is not defined.
!INCLUDE filename Reads and evaluates the file filename before
continuing with the current description file. If
Directive Description
────────────────────────────────────────────────────────────────────────────
continuing with the current description file. If
filename is enclosed by angle brackets (< >),
NMAKE searches for the file in the directories
specified by the INCLUDE macro. Otherwise, it
looks only in the current directory. The
INCLUDE macro is initially set to the value of
the
INCLUDE environment variable.
!UNDEF macroname Marks macroname as being undefined in NMAKE's
symbol table.
────────────────────────────────────────────────────────────────────────────
The constantexpression used with the !IF directive can consist of integer
constants, string constants, or program invocations. Integer constants can
use the C unary operators for numerical negation (-), one's complement (~),
and logical negation (!). They can also use any of the C binary operators
listed in Table 6.5.
Table 6.5 Directive Operators
╓┌─────────────────────┌─────────────────────────────────────────────────────╖
Operator Description
────────────────────────────────────────────────────────────────────────────
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulus
& Bitwise AND
| Bitwise OR
^^ Bitwise XOR
&& Logical AND
|| Logical OR
<< Left shift
>> Right shift
== Equality
Operator Description
────────────────────────────────────────────────────────────────────────────
== Equality
!= Inequality
< Less than
> Greater than
<= Less than or equal to
>= Greater than or equal to
────────────────────────────────────────────────────────────────────────────
You can group expressions using parentheses. NMAKE treats numbers as decimal
unless they start with 0 (octal) or 0x (hexadecimal). Use the equality (==)
operator to compare two strings for equality or the inequality (!=) operator
to compare for inequality. Enclose strings with quotes. Program invocations
must be in square brackets ([ ]).
The following example illustrates directives:
!INCLUDE <infrules.txt>
!CMDSWITCHES +D
winner.exe:winner.obj
!IFDEF debug
! IF "$(debug)"=="y"
LINK /CO winner.obj;
! ELSE
LINK winner.obj;
! ENDIF
!ELSE
! ERROR Macro named debug is not defined.
!ENDIF
The !INCLUDE directive causes NMAKE to insert the file INFRULES.TXT into the
description file. The !CMDSWITCHES directive turns on the /D option, which
displays the dates of the files as they are checked. If WINNER.EXE is
out-of-date with respect to WINNER.OBJ, the !IFDEF directive checks to see
if the macro debug is defined. If it is defined, the !IF directive checks
to see if it is set to y. If it is, the linker is invoked with the /CO
option; otherwise it is invoked without. If the debug macro is not
defined, the !ERROR directive prints the message and NMAKE stops.
6.3.6 Pseudotargets
Pseudotargets are useful for building a group of files or executing a group
of commands.
A "pseudotarget" is similar to a target, but it is not a file. It is a name
that serves as a "handle" for building a group of files or executing a group
of commands. In the following example, UPDATE is a pseudotarget.
UPDATE: *.*
!COPY $** a:\product
When NMAKE evaluates a pseudotarget, it always considers the dependents to
be out-of-date. In the example, NMAKE copies each of the dependent files to
the specified drive and directory.
Like macro names, pseudotarget names are case sensitive. Predefined
pseudotarget names are all uppercase.
The pseudotargets in Table 6.6 are predefined to provide special rules in a
description file. You can use their names on the command line, in a
description file, or in the TOOLS.INI file. You need not specify them as
targets; NMAKE uses the rules they define no matter where they appear.
Table 6.6 Pseudotargets
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Pseudotarget Action
────────────────────────────────────────────────────────────────────────────
.IGNORE: Ignores exit codes returned by programs
called from the description file. Same
effect as invoking NMAKE with the /I
option.
.PRECIOUS: target(s) Tells NMAKE not to delete target(s) if
the commands that build it are quit or
interrupted. Using this pseudotarget
overrides the NMAKE default. By default,
NMAKE deletes the target if it cannot be
sure the target is built successfully.
The .PRECIOUS pseudotarget is rarely
Pseudotarget Action
────────────────────────────────────────────────────────────────────────────
The .PRECIOUS pseudotarget is rarely
needed. Like most professional tools,
Microsoft language tools clean up by
themselves when errors occur.
.SILENT: Does not display lines as they are
executed. Same effect as invoking NMAKE
with the /S option.
.SUFFIXES:list Lists file suffixes for NMAKE to try
when building a target file for which no
dependents are specified. This list is
used together with inference rules. See
Section 6.3.4, "Inference Rules."
When NMAKE finds a target without any
dependents, it searches the current
directory for a file with the same base
name as the target and a suffix from the
Pseudotarget Action
────────────────────────────────────────────────────────────────────────────
name as the target and a suffix from the
list. If NMAKE finds such a file, and if
an inference rule applies to the file,
then NMAKE treats the file as a depen-
dent of the target. The order of the
suffixes in the list defines the order
in which NMAKE searches for the file.
The list is predefined as follows:
.SUFFIXES: .obj .exe .c .asm
To add suffixes to the list, specify
.SUFFIXES :
followed by the new suffixes. To clear
the list, specify .SUFFIXES:
────────────────────────────────────────────────────────────────────────────
6.3.7 PWB's extmake Syntax
NMAKE description files can use the same syntax as the extmake switch of PWB
(see Chapter 8, "Customizing the Microsoft Programmer's WorkBench"). This
syntax allows you to determine the drive, path, base name, and extension of
the first dependent, information that is not otherwise available. The file
name, and parts of its name, are represented using the syntax
%|partsF
where parts is one or more of the following:
Letter Description
────────────────────────────────────────────────────────────────────────────
d Drive
e File extension
f File base name
p Path
s Complete name
The following example uses extmake syntax:
sample.obj : sample.c
CL /Fod:%|pfF %|dfeF
In this example, the sequence %|pfF represents the path and base name of
the first dependent file, while the sequence %|dfeF represents the drive,
base name, and extension of the same file. The example, then, compiles the
file and writes the output to a file on the same path but with the default
.OBJ extension.
The percent symbol (%) is a replacement character in DOS and OS/2 command
lines in the description file. To use extmake syntax in command-line
arguments, specify each percent symbol as a double percent symbol (%%).
6.4 Command-Line Options
NMAKE accepts a number of options, which are listed in Table 6.7. You can
specify options in uppercase or lowercase and use either a slash or dash.
For example, -A, /A, -a, and /a all represent the same option.
Table 6.7 NMAKE Options
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Option Action
────────────────────────────────────────────────────────────────────────────
/A Builds all of the requested targets even
if they are not out-of-date.
/C Suppresses nonfatal error or warning
messages and the NMAKE logo display.
/D Displays the modification date of each
file.
/E Causes environment variables to override
Option Action
────────────────────────────────────────────────────────────────────────────
/E Causes environment variables to override
macro definitions in description files.
See Section 6.3.3, "Macros."
/F filename Specifies filename as the name of the
description file. If you supply a dash
(-) instead of a file name, NMAKE gets
input from the standard input device
instead of the description file.
/HELP Calls the QuickHelp utility. If the
QuickHelp program is not available,
NMAKE displays the most commonly used
NMAKE options.
/I Ignores return codes from commands
listed in the description file. NMAKE
processes the whole description file
even if errors occur.
Option Action
────────────────────────────────────────────────────────────────────────────
even if errors occur.
/N Displays but does not execute the
description file's commands. This option
is useful for debugging description
files and checking which targets are
out-of-date.
/NOLOGO Suppresses the NMAKE logo display.
/P Displays all macro definitions and
target descriptions on the standard
output device.
/Q Returns zero if the target is up-to-date
and nonzero if it is not. This option is
useful when running NMAKE from a batch
file.
Option Action
────────────────────────────────────────────────────────────────────────────
/R Ignores inference rules and macros that
are predefined or defined in the
TOOLS.INI file.
/S Suppresses the display of commands
listed in the description file.
/T Changes the modification dates for
out-of-date target files to the current
date.
/X filename Sends all error output to filename,
which can be a file or a device. If you
supply a dash (-) instead of a file name,
the error output is sent to the standard
output device.
/Z Used for internal communication between
Option Action
────────────────────────────────────────────────────────────────────────────
/Z Used for internal communication between
NMAKE and PWB.
/? Displays a brief summary of NMAKE syntax
and exits to the operating system.
────────────────────────────────────────────────────────────────────────────
The following command specifies two NMAKE options:
NMAKE /f sample.mak /c targ1 targ2
The /f option tells NMAKE to read the description file SAMPLE.MAK. The /c
option tells NMAKE not to display nonfatal error messages and warnings. The
command lists two targets (targ1 and targ2) to update.
NMAKE /D /N targ1 targ1.mak
In the example above, NMAKE updates the target targ1. If the current
directory does not contain a file named MAKEFILE, NMAKE reads the file
TARG1.MAK as the description file. The /D option displays the modification
date of each file; the /N option displays the commands without executing
them.
6.5 NMAKE Command Files
Occasionally, you may need to give NMAKE a long list of command-line
arguments that exceeds the maximum length of a command line (128 characters
in DOS, 256 in OS/2). To do this, place the command arguments in a file,
then give the name of the file when you run NMAKE.
For instance, say that you create a file named UPDATE, which consists of
this line:
/S "program = sample" sort.exe search.exe
If you start NMAKE with the command
NMAKE @update
NMAKE reads its command-line arguments from UPDATE. The at sign (@) tells
NMAKE to read arguments from the file. The effect is the same as if you
typed the arguments directly on the command line:
NMAKE /S "program = sample" sort.exe search.exe
Within the file, line breaks between arguments are treated as spaces. Macro
definitions that contain spaces must be enclosed in quotation marks, just as
if you typed them on the command line. You can continue a macro definition
across multiple lines by ending each line except the last with a backslash (
\ ):
/S "program \
= sample" sort.exe search.exe
This file is equivalent to the first example. The backslash in the example
allows the macro definition ("program = sample" ) to span two lines.
6.6 The TOOLS.INI File
You can customize NMAKE by placing commonly used macros and inference rules
in the TOOLS.INI initialization file. Settings for NMAKE must follow a line
that begins with [NMAKE]. This part of the initialization file can contain
macro definitions, .SUFFIXES lists, and inference rules. For example,
[NMAKE]
CC=cl
CFLAGS=-Gc -Gs -W3 -Oat
.c.obj:
$(CC) -c $(CFLAGS) $*.c
If TOOLS.INI contains the code above, NMAKE reads and applies the lines
following [NMAKE]. The example defines the macros CC and CFLAGS and
redefines the inference rule for making .OBJ files from .C sources.
NMAKE looks for TOOLS.INI in the current directory. If it is not found
there, NMAKE searches the directory specified by the INIT environment
variable.
6.7 In-Line Files
NMAKE can write "in-line files," which can contain any text you specify. One
use for in-line files is to write a response file for another utility such
as LIB. (Response files are useful when you need to supply a program with a
long list of arguments that exceeds the maximum length of the command line.)
Use this syntax to create an in-line file:
target : dependents
command << «filename»
inlinetext
<<«KEEP | NOKEEP»
All of the text between the two sets of double angle brackets () is placed
in the in-line file. The filename is optional. If you don't supply filename,
NMAKE gives the in-line file a unique name. NMAKE places the in-line file in
the current directory or, if the TMP environment variable is defined, in the
directory specified by TMP.
The in-line file can be temporary or permanent. If you don't specify
otherwise, or if you specify NOKEEP, it is temporary. Specify KEEP to retain
the file.
The following example creates a LIB response file named LIB.LRF:
math.lib : add.obj sub.obj mul.obj div.obj
LIB @<<lib.lrf
math.lib
-+add.obj-+sub.obj-+mul.obj-+div.obj
listing
<<KEEP
The resulting response file tells LIB which library to use, the commands to
execute, and the listing file to produce:
math.lib
-+add.obj-+sub.obj-+mul.obj-+div.obj
listing
The in-line file specification can create more than one in-line file. For
instance,
target.abc : depend.xyz
cat <<file1 <<file2
I am the contents of file1.
<<KEEP
I am the contents of file2.
<<KEEP
The example creates two in-line files named FILE1 and FILE2; then NMAKE
executes the command:
CAT file1 file2
The KEEP keywords tell NMAKE not to delete FILE1 and FILE2 when done.
6.8 NMAKE Operations Sequence
If you are writing a complex description file, you may need to know the
exact order of steps that NMAKE follows. This section describes those steps
in order.
When you run NMAKE from the command line, its first task is to find the
description file, following these steps:
1. If NMAKE is invoked with the /F option, it uses the file name
specified in the option.
2. If /F is not specified, NMAKE looks for a file named MAKEFILE in the
current directory. If such a file exists, it is used as a description
file.
3. If MAKEFILE is not in the current directory, NMAKE parses the command
line for the first string that is not an option or a macro definition
and treats this string as a file name. If the file-name extension does
not appear in the .SUFFIXES list, NMAKE uses the file as the
description file. If the extension appears in the .SUFFIXES list,
NMAKE tries additional strings until it finds a suitable file. (See
Section 6.3.6, "Pseudotargets," for a description of the .SUFFIXES
list.)
4. If NMAKE still has not found a description file, it returns an error.
NMAKE stops searching for a description file as soon as it finds one, even
if other potential description files exist. If you specify /F, NMAKE uses
the file specified by that option even if MAKEFILE exists in the current
directory. Similarly, if NMAKE uses MAKEFILE, any description file listed in
the command line is treated as a target.
If you do not specify targets, NMAKE updates only the first target in the
description file.
Next, NMAKE updates every target listed on the command line. If none is
listed, NMAKE updates only the first target in the description file. (This
behavior differs from the older MAKE program's default; see Section 6.9,
"Differences between NMAKE and MAKE.")
NMAKE then applies macro definitions and inference rules in the following
order, from highest to lowest priority:
1. Macros defined on the command line
2. Macros defined in a description file or include file
3. Inherited macros
4. Macros defined in the TOOLS.INI file
5. Predefined macros such as CC and AS
Definitions in later steps take precedence over definitions in earlier
steps. The /E option, however, causes inherited macros to override macros
defined on the command line. The /R option causes NMAKE to ignore macros and
inference rules that are predefined or defined in TOOLS.INI.
Now NMAKE updates each target in the order in which it appears in the
description file. It compares the date and time of each dependent with that
of the target and performs the commands needed to update the target. If you
specify the /A option or if the target is a pseudotarget, NMAKE updates the
target even if its dependents are not out-of-date.
If the target has no explicit dependents, NMAKE looks in the current
directory for one or more files whose extensions are in the .SUFFIXES list.
If it finds such files, NMAKE treats them as dependents and updates the
target according to the commands.
If no commands are given to update the target or if the dependents cannot be
found, NMAKE applies inference rules to build the target. By default, it
tries to build .EXE files from .OBJ files; and it tries to build .OBJ files
from .C and .ASM sources. In practice, this means you should specify .OBJ
files as dependents, because NMAKE compiles your source files when it can't
find the .OBJ files.
NMAKE normally quits processing the description file when a command returns
an error. In addition, if it cannot tell that the target was built
successfully, NMAKE deletes the partially created target. If you use the /I
commandline option, NMAKE ignores exit codes and attempts to continue
processing. The .IGNORE pseudotarget has the same effect. To prevent NMAKE
from deleting the partially created target, specify the target name in the
.PRECIOUS pseudotarget.
Alternatively, you can use the dash (-) command modifier to ignore the error
code for an individual command. An optional number after the dash tells
NMAKE to continue if the command returns an error code that is less than or
equal to the number, and to stop if the error code is greater than the
number.
You can help document errors by using the !ERROR directive to print
descriptive text. The directive causes NMAKE to print some text, then stop,
even if you use /I, .IGNORE, or the dash (-) modifier.
6.9 Differences between NMAKE and MAKE
As its name implies, NMAKE is a new utility that replaces the older
Microsoft MAKE program. NMAKE differs from MAKE in the following ways:
■ NMAKE does not evaluate targets sequentially. Instead, NMAKE updates
the targets you specify when you invoke it, regardless of their
positions in the description file. If no targets are specified, NMAKE
updates only the first target in the file.
■ NMAKE accepts command-line arguments from a file.
■ NMAKE provides more command-line options.
■ NMAKE provides more predefined macros.
■ NMAKE permits substitutions within macros.
■ NMAKE supports directives placed in the description file.
■ NMAKE allows you to specify include files in the description file.
The first item in the list deserves special emphasis. While MAKE normally
builds every target, working from beginning to end of the description file,
NMAKE expects you to specify targets on the command line. If you do not,
NMAKE builds only the first target in the description file.
The difference is clear if you run NMAKE using a typical MAKE description
file, which lists a series of subordinate targets followed by a higher-level
target that depends on the subordinates:
pmapp.obj : pmapp.c
CL /c /G2sw /W3 pmapp.c
pmapp.exe : pmapp.obj pmapp.def
LINK pmapp, /align:16, NUL, os2, pmapp
MAKE builds both targets (PMAPP.OBJ and PMAPP.EXE), but NMAKE builds only
the first target (PMAPP.OBJ).
Because of these performance differences, you may want to convert MAKE files
to NMAKE files. MAKE description files are easy to convert. A simple method
is to create a new description block at the beginning of the file. Give this
block a pseudotarget named ALL and list the top-level target as a
dependent of ALL. To build ALL, NMAKE must update every target upon which
the target of ALL depends:
ALL : pmapp.exe
pmapp.obj : pmapp.c
CL /c /G2sw /W3 pmapp.c
pmapp.exe : pmapp.obj pmapp.def
LINK pmapp, /align:16, NUL, os2, pmapp
If the above file is named MAKEFILE, you can update the target PMAPP.EXE
with the command
NMAKE
or the command
NMAKE ALL
Note that it is not necessary to list PMAPP.OBJ as a dependent of ALL.
NMAKE builds a dependency tree for the entire description file, and builds
whatever files are needed to update PMAPP.EXE. So if PMAPP.C is out-ofdate
with respect to PMAPP.OBJ, NMAKE compiles PMAPP.C to create PMAPP.OBJ, then
links PMAPP.OBJ to create PMAPP.EXE.
The same technique is suitable for description files with more than one
top-level target. List all of the top-level targets as dependents of ALL:
ALL : pmapp.exe second.exe another.exe
The example updates the targets PMAPP.EXE, SECOND.EXE, and ANOTHER.EXE.
If the description file lists a single, top-level target, you can use an
even simpler technique. Move the top-level block to the beginning of the
file:
pmapp.exe : pmapp.obj pmapp.def
LINK pmapp, /align:16, NUL, os2, pmapp
pmapp.obj : pmapp.c
CL /c /G2sw /W3 pmapp.c
NMAKE updates the second target (PMAPP.OBJ) whenever needed to keep the
first target (PMAPP.EXE) current.
Chapter 7 Creating Help Files with HELPMAKE
────────────────────────────────────────────────────────────────────────────
If you have used PWB or other Microsoft language products such as QuickC,
you are familiar with the many advantages of on-line help. The Microsoft
Help-File-Creation Utility (HELPMAKE) allows you to create your own help
files for use with Microsoft products. It also allows you to customize the
help files supplied with Microsoft language products.
HELPMAKE translates help text files into a help database accessible from
within the following:
■ Microsoft C 6.0 Programmer's WorkBench (PWB)
■ QuickHelp Utility
■ Microsoft Editor 1.02
■ Microsoft QuickC 2.0
■ Microsoft QuickPascal 1.0
■ Microsoft QuickBASIC 4.5
This chapter describes how to create and modify help files using the
HELPMAKE utility.
7.1 Structure and Contents of a Help Database
HELPMAKE creates a help database from one or more input files that contain
information formatted for the help system. This section defines some of the
terms involved in formatting and outlines the formats that HELPMAKE can
process.
7.1.1 Contents of a Help File
As you might expect, each help text file starts with a topic and some
information about the topic, then lists another topic and some information
about it, and so on. In HELPMAKE terminology, topics are called "contexts";
the information is called "topic text."
The .context command introduces a context. In the source file for C 6.0
help, for example, this line introduces help for the open function:
.context open
The .context command and other formatting elements are described in Section
7.5, "Help Text Conventions."
Whether a context is one or several words depends on the application.
QuickBASIC, for example, considers spaces to be delimiters, so in QuickBASIC
help files contexts are limited to a single word. Other applications, such
as the Microsoft Editor, can handle contexts that span several words. Either
way, the application simply hands the context to an internal "help engine,"
which searches the database for information.
Often, especially with library routines, the same information applies to
more than one subject. For example, the string-to-number functions strtod,
strtol, and stroul share the same help text. The help file lists all three
function names as contexts for one block of topic text. The converse,
however, is not true. You cannot specify different blocks of topic text, in
different places in the help file, to describe a single subject.
Cross-references help you navigate through a help database.
Cross-references make it possible to view information about related topics,
including header files and code examples. The help for the open function,
for example, references the access function and the ASCII header file
FCNTL.H. Cross-references can point to other contexts in the same help
database, to contexts in other help databases, or to ASCII files outside the
database.
Help files can have two kinds of cross-references:
■ Implicit
■ Explicit, or hyperlinks
Implicit cross-references are coded with an ordinary .context command.
The word "open" is an implicit cross-reference throughout C 6.0 help. If you
select the word "open" anywhere in C 6.0 help, the help system displays
information on the open function. As illustrated above, the context for open
begins with an ordinary .context command. As a result, anywhere that you
select "open," the help system references this context.
Hyperlinks are explicit cross-references marked by invisible text.
A "hyperlink" is an explicit cross-reference tied to a word or phrase at a
specific location in the help file. You create hyperlinks when you write the
help text. The hyperlink consists of a word or phrase followed by invisible
text that gives the context to which the hyperlink refers.
For example, to cause an instance of the word "formatting" to display help
on the printf function, you would create an explicit cross-reference from
the word "formatting" to the context "printf." Elsewhere in the file,
"formatting" has no special significance but, at that one position, it
references the help for printf. Section 7.5.4 describes how to create
hyperlinks.
Formatting flags let you change the appearance of text.
Help text can also include formatting flags to control the appearance of the
text on the screen. Using these flags, you can make certain words appear in
various colors, inverse video, and so forth, depending on the application
displaying help and the graphics capabilities of the host computer.
7.1.2 Help File Formats
You can create help files using any of three formats:
■ QuickHelp format
■ Rich Text Format (RTF)
■ Minimally formatted ASCII
In addition, you can reference unformatted ASCII files, such as include
files, from within a help database.
An entire help system (such as the one supplied with Microsoft C or
QuickBASIC) can use any combination of files formatted with different format
types. With C, for example, the README.DOC information file is encoded as
minimally formatted ASCII; the help files for the PWB, C language, and
run-time library are encoded in the QuickHelp format. The database also
cross-references the header (include) files, which are unformatted ASCII
files stored outside the database.
QuickHelp
QuickHelp format is the default and is the format into which HELPMAKE
decodes help databases. Use any text editor to create a QuickHelp-format
help text file. QuickHelp format also lends itself to a relatively easy
automated translation from other document formats.
QuickHelp files can contain any kind of cross-reference or formatting
attribute. Typically, you use QuickHelp format for any changes to a database
supplied by Microsoft.
RTF
Rich Text Format (RTF) is a Microsoft word-processing format that many other
word processors also support. You can create RTF help text with any word
processor that generates RTF output. You can also use any utility program
that takes word-processor output and produces an RTF file.
Use RTF when you want to transfer help files from one application to another
while retaining formatting information. You can format RTF files directly
with the word-processing program; you need not edit them to insert any
special commands or tags. Like QuickHelp files, RTF files can contain
formatting attributes and cross-references.
Minimally Formatted ASCII
Minimally formatted ASCII files simply define contexts and their topic text.
These files cannot contain screen-formatting commands or explicit
crossreferences (implicit cross-references are allowed). They are often used
to display text such as README.DOC and small help files that do not require
compression.
Unformatted ASCII
Unformatted ASCII files are exactly what their name implies: regular ASCII
files with no special formatting commands, context definitions, or special
information. An unformatted ASCII file does not become part of the help
database. Only its name is used as the object of a cross-reference. The
standard C header (include) files are unformatted ASCII files used for
cross-references by the help system for the C run-time library. Unformatted
ASCII files are also useful for storing program examples.
7.2 Invoking HELPMAKE
The HELPMAKE program can encode or decode help files, allowing you to create
new help files or modify existing ones. Encoding converts a text file to a
compressed help database. HELPMAKE can encode text files written in
QuickHelp, RTF, and minimally formatted ASCII format. Decoding converts a
help database to a text file for editing. HELPMAKE always decodes a help
database into a QuickHelp format text file.
Invoke HELPMAKE with the following syntax:
HELPMAKE «options» { /En | /D } { sourcefiles }
The options modify the action of HELPMAKE; they are described in Section
7.3.
Use the /E option to encode with HELPMAKE and use the /D option to decode.
You must supply either the /E (encode) or the /D (decode) option. When
encoding (/E) to create a help database, you must use the /O option to
specify the file name of the database.
The sourcefile field is required. It specifies the input file for HELPMAKE.
If you use the /D (decode) option, sourcefile can be one or more help
database files (such as QC.HLP). HELPMAKE decodes the database files into a
single text file. If you use the /E (encode) option, sourcefile can be one
or more help text files (such as QC.SRC). Separate file names with a space.
Standard wild-card characters can also be used.
The example below invokes HELPMAKE with the /V, /E, and /O options (see
Section 7.3.1, "Options for Encoding"). HELPMAKE reads input from the text
file my.txt and writes the compressed help database in the file my.hlp.
The /E option causes maximum compression. Note that the DOS redirection
symbol (>) sends a log of HELPMAKE activity to the file my.log. You may
find it helpful to redirect the log file because, in its more verbose modes
(given by /V), HELPMAKE may generate a lengthy log.
HELPMAKE /V /E /Omy.hlp my.txt > my.log
The example below invokes HELPMAKE to decode the help database my.hlp into
the text file my.src, given with the /O option. Once again, the /V option
results in verbose output, and the output is directed to the log file
my.log. Section 7.3.2 describes additional options for decoding.
HELPMAKE /V /D /Omy.src my.hlp > my.log
7.3 HELPMAKE Options
HELPMAKE accepts a number of command-line options, which are described
below. You can specify options in uppercase or lowercase letters, and
precede them with either a forward slash ( / ) or a dash (-). For example,
-L, /L, -l, and /l all represent the same option. Most options apply only to
encoding; others apply only to decoding; and a few apply to both.
7.3.1 Options for Encoding
When you encode a file─that is, when you build a help database─you must
specify the /E option. In addition, you can supply various other options
that control the way HELPMAKE works. All the options that apply when
encoding are listed below:
Option Action
────────────────────────────────────────────────────────────────────────────
/Ac Specifies c as an application-specific
control character for the help database
file. The character marks a line that
contains special information for
internal use by the application. For
example, QuickC uses the colon (:).
/C Indicates that the context strings for
this help file are case sensitive. At
run time, all searches for help topics
are case sensitive if the help database
was built with the /C option in effect.
/E«n» Creates (encodes) a help database from a
specified text file. The optional n
indicates the amount of compression to
take place. If n is omitted, HELPMAKE
compresses the file as much as possible,
thereby reducing the size of the file by
about 50%. The more compression
requested, the longer HELPMAKE takes to
create a database file. The value of n
is a number in the range 0 - 15. It is
the sum of successive powers of 2
representing various compression
techniques, as listed below:
Value Technique
────────────────────────────────────────────────────────────────────────────
0 No compression
1 Run-length compression
2 Key word compression
4 Extended key word
compression
8 Huffman compression
Add values to combine compression
techniques. For example, use /E3 to get
run-length and key word compression.
This is useful in the testing stages of
creating a help database when you need
to create the database quickly and are
not too concerned with size.
/H Displays a summary of HELPMAKE syntax
and exits.
/HELP Invokes QH.EXE, the QuickHelp utility,
for help about HELPMAKE. If QuickHelp is
not available, displays the same
information as the /H option.
/K filename Optimizes key word compression by
supplying a
list of characters that act as word
separators. The filename is a file
containing your list of separator
characters.
When you select key word compression,
HELPMAKE scans the help file to identify
"key words." A key word is any word that
occurs often enough to justify replacing
it with a shorter character sequence.
HELPMAKE normally uses the following
characters as word separators:
■ All characters from 0-32 (including
the space)
■ !"#&'( )*+'-, /:;<=>?@[\]^_`{|}~
■ 127
When performing key word compression,
HELPMAKE treats as a word any series of
characters not appearing in the
separator list.
Depending on the content of your help
file, you may be able to improve key
word compression by using the /K option
to specify a different list of separator
characters. For instance, the default
separator list contains the number sign
(#). If your help file contains #include
directives, HELPMAKE normally treats
#include as the word include without a
number sign. To cause HELPMAKE to treat
#include as a word, you could specify
the following separator list:
!"&'()*+'-,/:;<=>?@[\]^_`{|}~
The list above does not include the
number sign. HELPMAKE always treats
characters in the range
0-32 as separators, so you do not need
to include them. Your list must include
all the other characters you want
HELPMAKE to use as separators, including
the space.
/L Locks the generated file so that it
cannot be decoded by HELPMAKE at a later
time.
/Odestfile Specifies destfile as the name of the
help database.
/Sn Specifies the type of input file,
according to the following n values:
Option File Type
────────────────────────────────────────────────────────────────────────────
/S1 Rich Text Format (RTF)
/S2 QuickHelp (default)
/S3 Minimally formatted ASCII
/T Translates dot commands into internal
format. If your help file contains dot
commands other than .context, you should
supply this option when encoding it. Dot
commands are described in Section 7.6.1,
"QuickHelp Format," and in later
sections.
/V«n» Indicates the verbosity of diagnostic
and informational output, depending on
the value of n. Increasing the value
adds more information to the output. If
you omit this option or specify only /V,
HELPMAKE gives you its most verbose
output. The possible values of n are
listed below:
Option Effect
────────────────────────────────────────────────────────────────────────────
/V Maximum diagnostic output
/V0 No diagnostic output and no
banner
/V1 Prints only HELPMAKE banner
(default)
/V2 Prints pass names
/V3 Prints contexts on first
pass
/V4 Prints contexts on each pass
/V5 Prints any intermediate
steps within each pass
/V6 Prints statistics on help
file and compression
/Wwidth Indicates the fixed width of the
resulting help text in number of
characters. The values of width can
range from 11 to 255. If the /W option
is omitted, the default is 76. When
encoding RTF source (/S1), HELPMAKE
automatically formats the text to width.
When encoding QuickHelp (/S2) or
minimally formatted ASCII (/S3) files,
HELPMAKE truncates lines to this width.
7.3.2 Options for Decoding
To decode a help database into QuickHelp files, you must use the /D option.
In addition, HELPMAKE accepts other options to control the decoding process.
The list below shows all the options that are valid when decoding:
Option Action
────────────────────────────────────────────────────────────────────────────
/D«letter» Decodes the input file into its original
text or component parts. If a
destination file is not specified with
the /O option, the help file is decoded
to stdout. HELPMAKE decodes the file
differently depending on the letter
specified:
Letter Effect
────────────────────────────────────────────────────────────────────────────
/D "Decode." Fully decodes the
help database, leaving all
cross-references and
formatting information
intact.
/DS "Decode split." Splits the
concatenated, compressed
help database into its
components using their
original names. If the
database was created without
concatenation (the default),
HELPMAKE simply copies it to
a file with its original
name. No decompression
occurs.
/DU "Decode unformatted."
Decompresses the database
and removes all screen
formatting and
cross-references. The output
can still be used later for
input and recompression, but
all screen formatting and
cross-references are lost.
/H Displays a summary of HELPMAKE syntax
and exits without encoding or decoding
any files.
/HELP Invokes QH.EXE, the QuickHelp utility,
for information about HELPMAKE. If
QuickHelp is not available, displays the
same information as the /H option.
/Odestfile Specifies destfile for the decoded
output from HELPMAKE. If destfile is
omitted, the help database is decoded to
stdout. HELPMAKE always decodes help
database files into QuickHelp format.
/T Translates dot commands from internal
format into dot-command format. You
should always supply this option when
decoding a help database that contains
dot commands other than .context.
/V«n» Indicates the verbosity of diagnostic
and informational output depending on
the value of n. The possible values are
listed below. If you omit this option or
specify only /V, HELPMAKE gives you its
most verbose output.
Option Effect
────────────────────────────────────────────────────────────────────────────
/V Maximum diagnostic output
/V0 No diagnostic output and no
banner
/V1 Prints only the HELPMAKE
banner
/V2 Prints pass names
/V3 Prints contexts on first
pass
7.4 Creating a Help Database
You can create a Microsoft-compatible help database by either of two
methods.
The first method is to decompress an existing help database, modify the
resulting help text file, and recompress the help text file to form a new
database.
The second and simpler method is to append a new help database to an
existing help database. This method involves the following steps:
1. Create a help text file in QuickHelp format, RTF, or minimally
formatted ASCII.
2. Use HELPMAKE to create a help database file. The example below invokes
HELPMAKE, using SAMPLE.TXT as the input file and producing a help
database file named sample.hlp:
HELPMAKE /V /E /Osample.hlp sample.txt > sample.log
3. Make a backup copy of the existing database file (for safety's sake).
4. Append the new help database file to the existing help database. The
example below concatenates the new database sample.hlp onto the end
of the CLANG.HLP database:
COPY clang.hlp /b + sample.hlp /b
5. Test the database. The sample.hlp database contains the context
sample. If you type the word "sample" in the PWB and request help on
it, the help window displays the text associated with the context
sample.
7.5 Help Text Conventions
Microsoft help databases have a common structure and follow certain
organizational conventions. You should follow the same conventions to create
Microsoft-compatible help files.
7.5.1 Structure of the Help Text File
The help-retrieval capability that is built into Microsoft products is
simply a data-retrieval tool. It imposes no restrictions on the content and
format of the help text. The HELPMAKE utility and the display routines built
into Microsoft language environments, however, make certain assumptions
about the format of help text. This section provides some guidelines for
creating help text files compatible with those assumptions.
In all three help text formats, the help text source file is a sequence of
topics, each preceded by one or more unique context definitions. The
following list specifies the various formats and the corresponding context
definition statements:
Format Context Definition
────────────────────────────────────────────────────────────────────────────
QuickHelp .context context
RTF \ par >>context \ par
Minimally formatted >>context
ASCII (none)
In QuickHelp format, each topic begins with one or more .context statements
that define the context strings that map to the topic text. Subsequent lines
up to the next .context statement constitute the topic text.
In RTF format, each context definition must be in a paragraph of its own
(denoted by \ par), beginning with the help delimiter (>>). Subsequent
paragraphs up to the next context definition constitute the topic text.
In minimally formatted ASCII, each context definition must be on a separate
line, and each must begin with the help delimiter (>>). As in RTF and
QuickHelp files, subsequent lines up to the next context definition
constitute the topic text.
See Section 7.6, "Using Help Database Formats," for detailed information
about these three formats.
7.5.2 Local Contexts
Context strings that begin with an "at" sign (@) are defined as "local" and
have no implicit cross-references. They are used in cross-references instead
of the context string that otherwise is generated.
When you use a local context, HELPMAKE does not generate a global context
string (a context string that is known throughout the help file). Instead,
it embeds an encoded cross-reference that has meaning only within the
current context. For example,
.context normal
This is a normal topic, accessible by the context string "normal."
[button\v@local\v] is a cross-reference to the following topic.
.context @local
This topic can be reached only if the user browses
sequentially through the file or uses the cross-reference
in the previous topic.
In the example above, the text [button\v@local\v] defines local as a
local context. If the user selects the text [button] or scrolls through
the file, the help system displays the topic text that follows the context
definition for local. Because local is defined with the "at" sign (@), it
can be accessed only by a hyperlink within the help file or by sequentially
browsing through the file. Making a context local saves file space and
speeds access.
7.5.3 Context Prefixes
Microsoft help databases use several context prefixes. A "context prefix" is
a single letter followed by a period. It appears before a context string
that has a predefined meaning. If you decode a Microsoft help database, many
of these contexts may appear in the resulting text file.
Most context prefixes are internal.
Except for the h. prefix, which is described below, context prefixes are
internal. You do not need to add them in help files that you write.
You can use the h. prefix to identify standard help-file contexts. For
instance, h.default identifies the default help screen: the screen that
normally appears when you select "top-level" help. Table 7.1 lists the
standard h. contexts.
Table 7.1 Standard h. Contexts
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Context Description
────────────────────────────────────────────────────────────────────────────
h.contents The table of contents for the help file.
You should also define the string
"contents" for direct reference to this
context.
h.default The default help screen, typically
displayed when the user presses SHIFT+F1
at the "top level" in most applications.
The contents are generally devoted to
Context Description
────────────────────────────────────────────────────────────────────────────
The contents are generally devoted to
information about using help.
h.index The index for the help file. You can
also define the string "index" for
direct reference to this context.
h.notfound The help text that is displayed when the
help system cannot find information
about the requested context. The text
could be an index of contexts, a topical
list, or general information about using
help.
h.pg# A specific page within the help file.
This is used in response to a "go to
page #" request.
h.pg$ The help text that is logically last in
Context Description
────────────────────────────────────────────────────────────────────────────
h.pg$ The help text that is logically last in
the file. This is used by some
applications in response to a "go to the
end" request made within the help window.
h.pg1 The help text that is logically first in
the file. This is used by some
applications in response to a "go to the
beginning" request made within the help
window.
h.title The title of the help database.
────────────────────────────────────────────────────────────────────────────
The context prefixes in Table 7.2 are internal to Microsoft products. They
appear in decompressed databases, but you do not need to use them.
Table 7.2 Microsoft Product Context Prefixes
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Prefix Purpose
────────────────────────────────────────────────────────────────────────────
d. Dialog box. Each dialog box is assigned
a number. Its help context string is d.
followed by the number (for example,
d.12).
e. Error number. If a product supports the
error-numbering scheme used by Microsoft
languages, it displays help for each
error using this prefix. For example,
the context e.c1234 refers to the C
compiler error message number C1234.
m. Menu item. Contexts that relate to
product menu items are defined by their
accelerator keys. For example, the Exit
Prefix Purpose
────────────────────────────────────────────────────────────────────────────
accelerator keys. For example, the Exit
selection on the FILE menu item is
accessed by ALT+F X and is referenced in
help by m.f.x.
n. Message number. Each message box is
assigned a number. Its help context
string is n. plus the number (for
example, n.5 ).
────────────────────────────────────────────────────────────────────────────
7.5.4 Hyperlinks
Explicit cross-references, or hyperlinks, in the help text file are marked
with invisible text. A hyperlink comprises a word or phrase followed by
invisible text that gives the context to which the hyperlink refers.
The keystroke that activates the hyperlink depends on the application.
Consult the documentation for each product to find the specific keystroke
needed.
When the user activates the hyperlink, the help system displays the topic
named by the invisible text. The invisible cross-reference text is formatted
as one of the following:
Hyperlink Text Action
────────────────────────────────────────────────────────────────────────────
contextstring Causes the help topic associated with
contextstring to be displayed. For
example, exeformat results in the
display of the help topic associated
with the context exeformat.
filename! Treats filename as a single topic to be
displayed. For example,
$INCLUDE:stdio.h! searches the
INCLUDE environment variable for file
STDIO.H and displays it as a single help
topic.
filename!contextstring Works the same way as contextstring
above, except that only the help file
filename is searched for the context. If
the file is not already open, the help
system finds it (by searching either the
current path or an explicit environment
variable) and opens it. For example,
$BIN:readme.doc!patches searches for
readme.doc in the BIN environment
variable and displays the topic
associated with patches.
In the following example, the word Example is a hyperlink:
\bSee also:\p \uExample\p\vopen.ex\v
The hyperlink refers to open.ex. If you select any of the letters of
Example, the help system displays the topic whose context is open.ex. On
the screen, this line appears as follows:
See also: Example
An application might display See also: and Example in different colors
or character types, depending on such factors as your default color
selection and type of monitor.
When a hyperlink needs to cross-reference more than one word, you must use
an anchor, as in the following example:
\bSee also:\p \uExample\p\vprintf.ex\v, fprintf, scanf, sprintf,
vfprintf, vprintf, vsprintf
\aformatting table\vprintf.table\v
This part of the example is an anchored hyperlink:
\aformatting table\vprintf.table\v
Anchored hyperlinks must fit on a single line.
The \ a flag creates an anchor for the cross-reference. In the example, the
phrase following the \ a flag (formatting table) is the hyperlink. It refers
to the context printf.table. The first \v flag marks both the end of the
hyperlink and the beginning of the invisible text. The name printf.table
is invisible; it does not appear on the screen when the help is displayed.
The second \v flag ends the invisible text.
7.6 Using Help Database Formats
The text format of the database can be any of three types. The list below
briefly describes these types. Sections 7.6.1-7.6.3 describe the formatting
types in detail.
An entire help system (such as the one supplied with the Professional
Development System or QuickC) can use any combination of files formatted
with different format types. With C, for example, the README.DOC information
file is encoded as minimally formatted ASCII; and the help files for the C
language and run-time library are encoded in the QuickHelp format. The
database also cross-references the header (include) files, which are
unformatted ASCII files stored outside the database.
Type Characteristics
────────────────────────────────────────────────────────────────────────────
QuickHelp Uses dot commands and embedded
formatting characters (the default
formatting type expected by HELPMAKE);
supports highlighting, color, and
cross-references. This format must be
compressed before using.
Minimally formatted ASCII Uses a help delimiter (>>) to define
help contexts; does not support
highlighting, color, or crossreferences.
This format can be compressed, but
compression is not required.
RTF Uses a subset of standard RTF; supports
highlighting, color, and
cross-references; supports dot commands.
This format must be compressed before
using.
7.6.1 QuickHelp Format
The QuickHelp format uses a dot command and embedded formatting flags to
convey information to HELPMAKE.
QuickHelp Dot Commands
QuickHelp supports a number of dot commands, which identify topics and
convey other topic-related information to the help system. If your help file
contains dot commands other than .context, you must supply the /T option
when encoding and decoding with HELPMAKE.
You can define more than one context for a single topic.
The most important dot command is the .context command. Every topic in a
QuickHelp file begins with one or more .context commands. Each .context
command defines a context string for the topic text. You can define more
than one context for a single topic, as long as you do not place any topic
text between them.
Typical dot commands are shown below. The first defines a context for the
#include C preprocessor directive. The second set illustrates multiple
contexts for one block of topic text. In this case, the same topic text
explains all of the string-to-number conversion routines in C.
.context #include
.
.description of #include goes here
.
.context strtod
.context strtol
.context strtoul
.
. description of string-to-number functions goes here
.
The QuickHelp format supports several other dot commands. Table 7.3 lists
all of the dot commands available in QuickHelp format.
Table 7.3 QuickHelp Dot Commands
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Command Action
────────────────────────────────────────────────────────────────────────────
.category string Lists the category in which the current
topic appears and its position in the
list of topics. The category name is
used by the QuickHelp Topic command,
which brings up the list of topics to
which the current topic belongs. Some
applications, such as the PWB, use this
name as a pointer to the applicable
table of contents.
.command Indicates that the topic text is not a
displayable help topic. Use this command
to hide hyperlink topics and other
internal information. Hyperlink topics
are described in Section 7.5.5,
Command Action
────────────────────────────────────────────────────────────────────────────
are described in Section 7.5.5,
"Hyperlink Commands."
.comment string The string is a comment that appears
only in the help source file. Comments
are especially useful for documenting
the purpose of cross-references.
Because comments are not inserted in the
help database, they are not restored
when you decompress a help file.
.context string The string introduces a topic.
.end Ends a paste section. See the .paste
command below.
Command Action
────────────────────────────────────────────────────────────────────────────
.freeze numlines Indicates that the first numlines lines
should be frozen as the top line of the
help screen. This is normally used to
freeze a row of cross-reference buttons
at the top of a help topic that might be
scrolled.
.length topiclength Indicates the default window size, in
topiclength lines, of the topic about to
be displayed. This command is always the
first line in the topic if present.
.list Indicates that the current topic
contains a list of topics. QuickHelp
displays a highlighted line; you can
choose
a topic by moving the highlighted line
over the desired topic and pressing
Command Action
────────────────────────────────────────────────────────────────────────────
over the desired topic and pressing
ENTER. Help searches for the first word
of the line.
.mark name «column» Defines a mark immediately preceding the
following line of text. This command can
be used in help script commands to
indicate that the display of a
particular topic begins at the marked
line. The name identifies the mark. The
optional column value is an integer that
indicates a column location within the
specified line.
.next context Tells the help system to look up the
next topic using
context instead of the next topic's name.
You can use this command to skip large
blocks of .command or .popup topics.
Command Action
────────────────────────────────────────────────────────────────────────────
blocks of .command or .popup topics.
.previous context Tells the help system to look up the
previous topic using context instead of
the previous topic's name. You can use
this command to skip large blocks of
.command or .popup topics.
.paste pastename Begins a paste section. The pastename
appears in the QuickHelp Paste menu.
.popup Tells the help system to display the
current topic as a popup instead of a
normal, scrollable topic.
.ref string(s) Tells the help system to display the
list of string topics in the Reference
menu. You can list as many topics as
needed; separate each additional string
Command Action
────────────────────────────────────────────────────────────────────────────
needed; separate each additional string
with a comma.
.topic text Defines text as the name or title to be
displayed in place of the context string
if the application help displays a title.
This command is always the first line in
the context unless you also use the
.length command.
────────────────────────────────────────────────────────────────────────────
QuickHelp Formatting Flags
The QuickHelp format supports a number of formatting flags that are used to
highlight parts of the help database and to mark hyperlinks in the help
text.
Each formatting flag consists of a backslash ( \ ) followed by a character.
Table 7.4 lists the formatting flags.
Table 7.4 Formatting Flags
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Formatting Flag Action
────────────────────────────────────────────────────────────────────────────
\a Anchors text for cross-references
\b, \B Turns boldface on or off
\i, \I Turns italics on or off
\p, \P Turns off all attributes
\u, \U Turns underlining on or off
\v, \V Turns invisibility on or off (hides
Formatting Flag Action
────────────────────────────────────────────────────────────────────────────
\v, \V Turns invisibility on or off (hides
cross-references in text)
\\ Inserts a single backslash in text
────────────────────────────────────────────────────────────────────────────
On monochrome monitors, text labeled with the bold, italic, and underlining
attributes appears in various ways, depending on the application (for
example, high intensity and reverse video are commonly displayed). On color
monitors, these attributes are translated by the application into suitable
colors, depending on the user's default color selections.
The \b, \i, \u, and \v options are toggles, turning on and off their
respective attributes. You can use several of these on the same text. Use
the \p attribute to turn off all attributes. Use the \v attribute to hide
cross-references and hyperlinks in the text.
HELPMAKE truncates the lines in QuickHelp files to the width specified with
the /W option. (See Section 7.3.1, "Options for Encoding," for more
information.) Only visible characters count toward the character-width
limit. Lines that begin with an application-specific control character are
truncated to 255 characters regardless of the width specification. See
Section 7.3.1 for more information about application-specific control
characters.
In the example below, the \b flag initiates boldface text for Returns:, and
the \p flag that follows the word reverts to plain text for the remainder of
the line.
\bReturns:\p a handle if successful, or -1 if not.
errno: EACCES, EEXIST, EMFILE, ENOENT
In the example below, \a anchors text for the hyperlink Example . The \v
flags define the cross-reference to be sample_prog and cause the text
between the flags to be invisible. Cross-references are described in the
following section.
\aExample \vsample_prog\v
QuickHelp Cross-References
Help databases contain two types of cross-references: implicit
cross-references and explicit cross-references. They are described in
Section 7.1.1, "Contents of a Help File."
An implicit cross-reference is any word that appears both in the topic text
and as a context in the help file. For example, any time you request help on
the word "close," the help window displays help on the close function. You
don't need to code implicit cross-references in your help text files.
Insert formatting flags to mark explicit cross-references.
Explicit cross-references (hyperlinks) are words or phrases on the screen
that are associated with a context. For example, the word "Example" in the
initial help-screen area for any C function is an explicit cross-reference
to the C program example for that function. You must insert formatting flags
in your help text files to mark explicit cross-references.
If the hyperlink consists of a single word, you can use invisible text to
flag it in the source file. The \v formatting flag creates invisible text,
as follows:
hyperlink\vcontext\v
Specify the first \v flag immediately following the word you want to use as
the hyperlink. Following the flag, insert the context that the hyperlink
crossreferences. The second \v flag marks the end of the context; that is,
the end of the invisible text. HELPMAKE generates a cross-reference whose
context is the invisible text, and whose hyperlink is the entire word.
If the hyperlink consists of a phrase, rather than a single word, you must
use anchored text to create explicit cross-references. Use the \a and \v
flags to create anchored text as follows:
\ahyperlink-words\vcontext\v
The \a flag marks an anchor for the cross-reference. The text that follows
the \a flag is the hyperlink. The hyperlink must fit entirely on one line.
The first \v flag marks both the end of the hyperlink and the beginning of
the invisible text that
contains the cross-reference context. The second \v flag marks the end of
the invisible text.
The following example contains three implicit cross-references to the C
routines abs, cabs, and fabs.
See also: abs, cabs, fabs
The following example shows the encoding for an explicit cross-reference to
an example program and a function template from the help database for the C
run-time library:
See also: Example\vopen.ex\v, Template\vopen.tm\v, close
Here, the hyperlinks are Example and Template, which reference the
contexts open.ex and open.tm. The example also contains an implicit
cross-reference to the close function.
The following example shows the encoding for an explicit cross-reference to
an entire family of functions:
See also: \ais... functions\vis_functions\v, atoi
The cross-reference uses anchored text to associate a phrase, rather than
just a word, with a context. In this example, the hyperlink is the anchored
phrase is... functions, and it cross-references the context is_functions.
In addition, the example contains an implicit cross-reference to the atoi
routine.
The code below is an example in QuickHelp format that contains a single
entry:
.context open
.length 13
\bInclude:\p <fcntl.h>, <io.h>, <sys\\types.h>, <sys\\stat.h>
\bPrototype:\p int open(char *path, int flag[, int mode]);
flag: O_APPEND O_BINARY O_CREAT O_EXCL O_RDONLY
O_RDWR O_TEXT O_TRUNC O_WRONLY
(can be joined by |)
mode: S_IWRITE S_IREAD S_IREAD | S_IWRITE
\bReturns:\p a handle if successful, or -1 if not.
errno: EACCES, EEXIST, EMFILE, ENOENT
\bSee also:\p \uExample\p\vopen.ex\v, \uTemplate\p\vopen.tp\v,
access, chmod, close, creat, dup, dup2, fopen, sopen, umask
The .length command near the beginning of the example specifies the size of
the initial window for the help text. Here, the initial window displays 13
lines.
The manifest constants (such as O_WRONLY and EEXIST), the C keywords (such
as int and char), and the other functions (such as sopen and access) are
implicit cross-references. The words Example and Template are explicit
cross-references to the example open.ex and to the open template open.tp,
respectively. Note the use of double backslashes in the include file names.
7.6.2 Minimally Formatted ASCII Format
A minimally formatted ASCII text file comprises a sequence of topics, each
preceded by one or more unique context definitions. Each context definition
must be on a separate line beginning with a help delimiter (>>). Subsequent
lines up to the next context definition constitute the topic text.
Minimally formatted ASCII files cannot contain highlighting.
Minimally formatted ASCII files can be used in two ways. You can compress
the file with HELPMAKE, creating a help database, or an application can
access the uncompressed file directly. Uncompressed files are somewhat
larger and slower to search, however. Minimally formatted ASCII files are of
fixed width, and they cannot contain highlighting (or other nondefault
attributes) or cross-references.
The following example, coded in minimally formatted ASCII, shows the same
text as the QuickHelp example in the previous section. The first line of the
example defines open as a context string. The minimally formatted ASCII
help file must begin with the help delimiter (>>), so that HELPMAKE or the
application can verify that the file is indeed an ASCII help file.
>>>>open
Include: <fcntl.h>, <io.h>, <sys\types.h>, <sys\stat.h>
Prototype: int open(char *path, int flag[, int mode]);
flag: O_APPEND O_BINARY O_CREAT O_EXCL O_RDONLY
O_RDWR O_TEXT O_TRUNC O_WRONLY
(can be joined by |)
mode: S_IWRITE S_IREAD S_IREAD | S_IWRITE
Returns: a handle if successful, or -1 if not.
errno: EACCES, EEXIST, EMFILE, ENOENT
See also: access, chmod, close, creat, dup, dup2, fopen, sopen, umask
When displayed, the help information appears exactly as it is typed into the
file. Any formatting codes are treated as ASCII text. Note that you do not
need to escape backslashes in minimally formatted ASCII files.
If you compress minimally formatted ASCII files, they are smaller and faster
to search.
7.6.3 Rich Text Format (RTF)
RTF is a Microsoft word-processing format supported by many other word
processors. It allows documents to be transferred from one application to
another without losing any formatting information. The HELPMAKE utility
recognizes a subset of the full RTF syntax. If your file contains any RTF
code that is not part of the subset, HELPMAKE ignores the code and strips it
out of the file.
Certain word-processing and file-conversion programs generate the RTF code
automatically as output. You need not worry about inserting RTF codes
yourself; you can simply format your help files directly with a
word-processor that generates RTF, using the attributes supported by the
subset. The only items you need to insert are the help delimiter (>>) and
context string that start each entry.
HELPMAKE recognizes the subset of RTF listed below:
RTF Code Action
────────────────────────────────────────────────────────────────────────────
\b Boldface. The application decides how to
display this; often it is intensified
text.
\fi <nnn> Paragraph first-line indent.
\i Italic. The application decides how to
display this; often it is reverse video.
\li <nnn> Paragraph indent from left margin.
\line New line (not new paragraph).
\par End of paragraph.
\pard Default paragraph formatting.
\plain Default attributes. On most screens this
is nonblinking normal intensity.
\tab Tab character.
\ul Underline. The application decides how
to display this; some adapters that do
not support underlining display it as
blue text.
\v Hidden text. Hidden text is used for
cross-reference information and for some
application-specific communications; it
is not displayed.
Using the word-processing program, you can break the topic text into
paragraphs. When HELPMAKE compresses the file, it formats the text to the
width given with the / W option, ignoring the paragraph formats.
As with the other text formats, each entry in the database source consists
of one or more context strings, followed by topic text. An RTF file can
contain QuickHelp dot commands.
The help delimiter (>>) at the beginning of any paragraph denotes the
beginning of a new help entry. The text that follows on the same line is
defined as a context for the topic. If the next paragraph also begins with
the help delimiter, it also defines a context string for the same topic
text. You can define any number of contexts for a block of topic text. The
topic text comprises all subsequent paragraphs up to the next paragraph that
begins with the help delimiter.
The code below is an example of a help database that contains a single entry
using subset RTF text. Note that RTF uses curly braces ({}) for nesting.
Thus, the entire file is enclosed in curly braces, as is each specially
formatted text item.
{\rtf1
\pard >>open\par
{\b Include:} <fcntl.h>, <io.h>, <sys\\types.h>, <sys\\stat.h>\par
\par
{\b Syntax:} int open( char * filename, int oflag[, int pmode ]
);\par
oflag: O_APPEND O_BINARY O_CREAT O_EXCL O_RDONLY\par
O_RDWR O_TEXT O_TRUNC O_WRONLY\par
(may be joined by |)\par
pmode: S_IWRITE S_IREAD S_IREAD | S_IWRITE\par
\par
{\b Returns:} a handle if successful, or -1 if not.\par
errno: EACCES, EEXIST, EMFILE, ENOENT\par
\par
{\b See also:} Examples{\v open.ex}, access, chmod, close, creat,
dup,\par
dup2, fopen, sopen, umask\par
>>open.ex\par
To build this help file, use the following command:\par
\par
HELPMAKE /S1 /E15 /OOPEN.HLP OPEN.RTF\par
\par
< Back >{\v !B}
}
Actual RTF output normally contains additional information that is not
visible to the user; HELPMAKE ignores this extra information.
Chapter 8 Customizing the Microsoft Programmer's WorkBench
────────────────────────────────────────────────────────────────────────────
Designed with flexibility in mind, the Microsoft Programmer's WorkBench
(PWB) provides a highly extensible development platform for the Microsoft C
Professional Development System. Using PWB it is easy to change basic
environment features such as screen colors and key assignments, and you can
add powerful new functions of your own using macros and C-language
extensions.
This chapter explains four methods for customizing the Programmer's
WorkBench: setting switches, assigning keystrokes, writing macros, and
writing C extensions. While it explains customization methods, the chapter
does not document every customizable feature of the Programmer's WorkBench.
Use on-line help as your primary source of information about these and other
PWB features.
This chapter assumes you are familiar with basic PWB operations and
terminology. If you are not, read "Using the Programmer's WorkBench" in
Installing and Using the Microsoft C Professional Development System.
8.1 Setting Switches
The Programmer's WorkBench has a number of "switches," or user-configurable
options, that control features such as screen colors. Each switch has a name
and can be assigned a value.
There are two ways to set PWB switches. The easiest way is by choosing
Editor Settings in the Options menu. You can also edit the TOOLS.INI
initialization file. These methods can also be used for more elaborate
customizations, such as writing macros.
8.1.1 Editing the <assign> Pseudofile
If you choose Editor Settings in the Options menu, PWB changes to the
<assign> pseudofile and displays it in the current window. (A pseudofile is
constructed dynamically by PWB; it exists only in memory.) The <assign>
file lists all the current PWB settings.
To change a switch, edit the line where it appears. For instance, the
vscroll switch controls how many lines PWB scrolls vertically; its default
setting is 1. To change it, move to the corresponding line:
vscroll:1
Change the 1 to 3 and move the cursor to another line. PWB highlights the
line to indicate the change is legal. (If you make an illegal change, PWB
signals an error.) The change takes effect immediately: now PWB scrolls text
three vertical lines at a time.
If you don't explicitly save a change, it disappears at the end of the
current session. You can save a change by saving <assign> as you would any
other file (by pressing ALT+A ALT+A F2). When you exit PWB, you are asked if
you want to save TOOLS.INI, the PWB initialization file, which records
customizations. Answer yes (type Y) to save the change.
You can also use this method for more elaborate customizations, such as
writing macros (see Section 8.3, "Writing Macros"). Simply insert a few
blank lines in <assign> and enter the new information in them. Note that PWB
only pays attention to lines you change or add to <assign>. Deleting a line
has no effect.
8.1.2 Editing the TOOLS.INI Initialization File
Another way to customize PWB is by editing TOOLS.INI, the initialization
file used by PWB and other Microsoft language tools. This method is useful
if you customize PWB extensively.
While the <assign> file lists every customizable PWB item, the TOOLS.INI
file contains lines only for items you have customized. Those items not
mentioned in TOOLS.INI are set to a default value.
Dividing TOOLS.INI into Sections
Since several tools can use TOOLS.INI, the file may contain information that
doesn't relate to PWB. If you customize more than one tool, TOOLS.INI is
divided into sections, one for each tool. Each section begins with a tag
consisting of the tool's base name enclosed in square brackets: [PWB] for
PWB.EXE, [NMAKE] for NMAKE.EXE, and so on.
For example, say you set the vscroll switch to 3 and save the change, but
you have not customized PWB in any other way. Your TOOLS.INI file will
contain this section:
[PWB]
vscroll:3
Settings following this tag are put in effect by PWB every time it starts.
You can also create sections of TOOLS.INI that PWB reads only in certain
circumstances. You can create sections for different video adapters,
file-name extensions, and operating system versions.
If you use more than one video display, TOOLS.INI can have a different
section for each display:
■ [PWB-mono]
■ [PWB-cga]
■ [PWB-ega]
■ [PWB-vga]
After each tag, you can set different screen colors, dimensions, and other
display-specific switches.
You can also create a section for files with specific extensions. For
instance, your TOOLS.INI file could contain a section beginning with the tag
[PWB-.C]
for C source files, and
[PWB-.ASM]
for assembly-language (.ASM) source files. Each time you load a file with
the designated extension, PWB reads the appropriate section of TOOLS.INI.
For each file type, you could use a different set of macros and other
customizations.
TOOLS.INI can also contain sections specific to operating system versions.
The following tag introduces a section specific to DOS version 3.20, for
instance:
[PWB-3.20]
You can combine tags as needed. For example, the tag
[PWB-3.20 PWB-10.10R]
applies to DOS version 3.20 and OS/2 version 1.1 real mode.
You can also create a section in TOOLS.INI containing switches for a
userwritten extension. See Section 8.4.3, "Describing Functions and
Switches." On-line help contains additional information about TOOLS.INI
tags.
8.2 Assigning Keystrokes
PWB allows you to assign any editing function to almost any keystroke.
Reassigning keystrokes doesn't change PWB graphic interface, however.
Keystrokes, like switches, are listed in the <assign> pseudofile (choose Key
Assignments in the Options menu) and can be changed there. For example, say
you want to assign the home cursor function to the SHIFT+HOME keystroke. The
default keystroke assignment for home is:
home:ctrl+home
If you change the assignment to
home:shift+home
SHIFT+HOME moves the cursor to the home (upper left) window position.
It is legal to assign more than one keystroke to the same function. For
example, many keystrokes invoke the select function, which selects a text
region. Thus, the previous example adds a new keystroke (SHIFT+HOME) for the
home function; it does not remove the previous assignment (CTRL+HOME).
There are two limitations on keystroke assignments:
■ You can't reassign a keystroke that PWB is using for a menu. For
instance, if ALT+F pulls down the File menu, PWB ignores any attempt
to reassign ALT+F.
■ You can't reassign ALT plus the number keys 1 - 9 (ALT+1, ALT+2, and
so on). These keystrokes are reserved for the file history menu items.
Each keystroke can only invoke one function. If you mistakenly assign a
key-stroke to more than one function, PWB uses the most recent assignment.
For example,
home:ctrl+a
setfile:ctrl+a
assigns the CTRL+A keystroke to two different functions, home and setfile.
The second assignment overrides the first, assigning CTRL+A to setfile.
Occasionally, you may want to "unassign," or disable, a keystroke. This is
done by assigning the unassigned function to the keystroke. For example,
unassigned:ctrl+a
disables CTRL+A. PWB signals an error when you press any unassigned key.
8.3 Writing Macros
The fastest way to create a new editing function for PWB is to write a
macro. The function can be as simple as inserting a long word or phrase, or
it can perform complex tasks by invoking PWB functions and other macros.
8.3.1 Macro Syntax
A macro can contain any combination of PWB functions, literal text, and
macro operators. You can define as many as 1,024 macros at one time.
Literal text is case sensitive.
Literal text is anything inside double quotes. Inside literal text, you can
represent a double quote as \" and a backslash as \\. Text is case
sensitive inside quotes and case insensitive outside them.
The following macro comments out a line of C source code:
comment:=begline "/* " endline " */"
comment:alt+c
The first line names the macro and tells what it does. The begline and
endline editor functions move the cursor, while the text inside quotes is
printed at the current cursor position. The second line assigns a keystroke
(ALT+C) to the macro.
A macro definition must fit on one logical line. If necessary, you can use
the backslash ( \ ) to continue the definition on the next line. For
instance, the definition
comment:=begline "/* " endline " */"
could be written as
comment:=begline \
"/* " endline \
" */"
Notice the extra space before each backslash. If you want a space between
the end of one line and the beginning of the other, you must precede the
backslash with two spaces.
You can use the arg function to pass arguments to functions. For example,
the following macro passes the argument 15 to the plines function (which
scrolls text down):
movedown:=arg "15" plines
Because arg precedes the literal text, the text doesn't appear on the
screen. Instead, it is passed as an argument to the next function, plines.
The macro scrolls the current text down 15 lines.
Arguments can use regular expression syntax, as well (regular expressions
are documented in on-line help):
endword:=arg arg "( !.!$!\\:!;!\\)!\\(!,)" psearch
The arg arg sequence directs the psearch function to treat the text argument
as a regular expression search pattern. This search pattern tells PWB to
search for the next period, end of line ($), colon, semicolon, close
parenthesis, open parenthesis, or comma.
A macro can invoke other macros:
lcomment:= "/* "
rcomment:= " */"
commentout:=begline lcomment endline rcomment
commentout:alt+z
The commentout macro invokes the previously defined macros lcomment and
rcomment.
In addition to standard PWB functions, macros can invoke user-defined
(extension) functions. See Section 8.4, "Writing and Building C Extensions."
8.3.2 Macro Responses
Some PWB functions ask you for confirmation. For example, the meta exit
(quit without saving) function normally asks if you really want to exit.
Such questions always take the answer "yes" (y) or "no" (n).
When you invoke such a function in a macro, the function assumes an answer
of yes and does not ask for confirmation. For example, the macro definition
quit:=meta exit
quit:alt+x
invokes meta exit when you press ALT+X. Because the meta exit function is
invoked from a macro, PWB exits without asking for confirmation.
The following operators allow you to restore normal prompting or change the
default responses:
Operator Description
────────────────────────────────────────────────────────────────────────────
< Asks for confirmation; if not followed
by another < operator, prompts for all
further questions
<y Assumes a response of yes
<n Assumes a response of no
A response operator applies to the function immediately preceding it. For
instance, you can add the operator to the quit macro definition to
restore the usual prompt:
quit:=meta exit <
quit:alt+x
Now the macro prompts for a response before it exits.
8.3.3 Macro Arguments
If you enter an argument in PWB and then invoke a macro, the argument is
passed to the first function in the macro that takes an argument:
tripleit:=copy paste paste
The tripleit macro invokes the copy and paste editing functions. If you
highlight a text area and then invoke the macro, your highlighted argument
is passed to the copy function, which copies the argument to the clipboard.
The macro then invokes paste twice. The effect is to insert two copies of
the highlighted text.
You cannot pass more than one argument from PWB to a macro.
You cannot pass more than one argument from PWB to a macro, even if the
macro invokes more than one function that can accept an argument. The
argument always goes to the first function in the macro that takes an
argument.
You can also prompt for input inside a macro and pass the input as an
argument using the prompt function as shown below:
newfile:=arg "Next file: " prompt setfile <
newfile:alt+n
The newfile macro prompts for a file name and then switches to the
specified file. The sequence arg "Next file: " passes a text argument to
prompt, which prints the text on the dialog line and waits for input. The
input is passed as a text argument to the setfile function, which switches
to that file. For more information on the prompt function, see on-line help.
8.3.4 Macro Conditionals
Macros can take different actions depending on certain conditions. Such
macros take advantage of the fact that PWB editing functions generally
return values─a TRUE (nonzero) value if successful or FALSE (zero) if
unsuccessful.
Macros can use four conditional operators:
Operator Description
────────────────────────────────────────────────────────────────────────────
:>label Defines a label that can be targeted by
other operators
=>label Jumps to label
+>label Jumps to label if the previous function
returns TRUE
->label Jumps to label if the previous function
returns FALSE
For example, the leftmarg macro moves the cursor to the left margin of the
editing window:
leftmarg:=:>leftmore left +>leftmore
The macro above invokes the left function repeatedly (jumping to the label
leftmore) until it returns FALSE, indicating the cursor has reached the left
margin.
The label must appear immediately after the conditional operator, with no
intervening spaces. A conditional operator without a label exits the macro
immediately if the condition is true. If the condition is false, the macro
continues execution. The following example demonstrates this:
turnon:=insertmode +> insertmode
This macro turns on insert mode regardless of whether insert mode is
currently on or off. If insert mode is off, the first invocation of
insertmode toggles the mode on and returns TRUE, causing the +> operator to
terminate the macro. If insert mode is currently on, the first invocation of
insertmode turns insert mode off and returns FALSE. The macro then invokes
insertmode a second time, turning insert mode back on.
8.3.5 Temporary Macros
Occasionally, you may want to create a macro that lasts only through the
current session. This can be done with the assign function. For example, the
following steps create the comment macro described above.
To create the macro:
■ Press ALT+A
■ Type comment:=begline "/* " endline " */"
■ Press ALT+=
To assign the ALT+C keystroke to the macro:
■ Press ALT+A
■ Type comment:alt+c
■ Press ALT+=
The macro is available immediately and then disappears at the end of the
current session.
8.3.6 Macro Recordings
Another way to create a macro is by recording your own actions. The entire
sequence of actions is saved and can be replayed later by pressing a key.
You start the recording by invoking the record function. PWB names the
resulting macro recordvalue by default, but you can use other names as well.
To record a macro:
■ Choose Record On from the Edit menu to start the recording.
■ Perform the actions you want to record.
■ Choose Record On again to end the recording.
■ If recordvalue is not already assigned, assign it to a keystroke as
described above.
After you complete these steps, a macro named recordvalue is available
through the keystroke you assigned in the last step above. When you press
this key, PWB replays the actions you recorded.
If you don't do anything more, the recorded macro is temporary─it disappears
when you exit PWB. To save the macro permanently:
■ Open the <record> pseudofile (press ALT+A, type <record>, press F2).
■ Copy the macro definition in <record>.
■ Paste the definition into the [PWB] section of your TOOLS.INI file.
Studying recorded macros can teach you a lot about macros and editor
functions. If you open the <record> pseudofile in a second window before you
record, you can watch PWB write the macro definition function by function.
If you save a recorded macro, you'll want to name it something other than
recordvalue, the default name. To do this, pass the new name as an argument
when you start the recording:
■ Press ALT+A ALT+A.
■ Type the new name.
■ Choose Record On from the Edit menu to start recording.
■ Complete the recording as usual.
You can expand an existing macro using the same process. If you supply the
name of an existing macro, PWB appends the recorded commands to the macro
instead of replacing it.
You can record a series of actions without executing them.
You can also make a "silent" recording, which records a series of actions
without executing them. Start the recording with a meta record command
(press F9 SHIFT+CTRL+R). Then complete the recording process as described
above.
8.4 Writing and Building C Extensions
An "extension" is a file containing one or more user-written functions. PWB
loads extensions at run time. Once the extension has been loaded, its
functions can be assigned their own keystrokes, given arguments, and invoked
in macros, exactly like other PWB functions.
User-written functions execute more quickly than macros.
The ability to load and call user-written functions makes PWB highly
extensible. Because they consist of compiled C code, your functions can
perform more complex jobs than macros can, and they execute many times
faster.
An extension contains executable code, but it differs from a normal
executable file in some important ways:
■ It does not contain the usual C start-up code.
■ It contains special data structures that describe its functions to
PWB.
■ Its functions are declared in a form that allows PWB to call them and
pass arguments to them.
■ Its functions can call native PWB functions, and some, but not all, C
library functions.
This section explains how to build, load, and invoke a PWB extension. The
example, CENTER.C, serves as a basis for discussion throughout the rest of
this chapter.
The CENTER.C extension contains one extension function, CenterLine, which
centers a line or range of lines in the current file.
/* CENTER.C: Sample PWB extension */
#define LINE_LENGTH 80 /* Assumes 80-column screen */
#include <string.h>
/* PWB extension header file */
#include "ext.h"
PWBFUNC CenterLine( unsigned argData,
ARG _far *pArg,
flagType fMeta );
/* Switch Table */
struct swiDesc swiTable[] =
{
{ NULL, NULL, 0 }
};
/* Command Table */
struct cmdDesc cmdTable[] =
{
{ "CenterLine", CenterLine, 0, NOARG | LINEARG },
{ NULL, NULL, 0, 0 }
};
/* Initialization Function */
void EXTERNAL WhenLoaded( void )
{
DoMessage( "Loading Center extension" );
}
/* Extension (user-written) function */
PWBFUNC CenterLine( unsigned argData,
ARG _far *pArg,
flagType fMeta )
{
PFILE pFile;
LINE yStart, yEnd;
int len;
char *pBuf, buf[BUFLEN];
/* Get a handle to the current file */
pFile = FileNameToHandle( "", "" );
/* Handle various argument types */
switch( pArg->argType )
{
case NOARG: /* No argument. Center current line */
yStart = yEnd = pArg->arg.noarg.y;
break;
case LINEARG: /* Center range of lines */
yStart = pArg->arg.linearg.yStart;
yEnd = pArg->arg.linearg.yEnd;
break;
}
/* Center current line or range of lines */
for( ; yStart <= yEnd; yStart++ )
{
/* Get a line from the current file */
len = GetLine( yStart, buf, pFile );
if( len > 0 )
{
/* Center the text in this line */
pBuf = buf + strspn( buf, " \t" );
len = strlen( pBuf );
memmove( buf+(LINE_LENGTH-len) / 2, pBuf, len+1 );
memset( buf, ' ', (LINE_LENGTH - len) / 2 );
/* Write modified line back to the current file */
PutLine( yStart, buf, pFile );
}
}
return TRUE;
}
Building and using a PWB extension involves four basic steps:
1. Compiling
2. Linking
3. Loading the extension into PWB
4. Assigning a keystroke to each function in the extension
You can build extensions for both real mode (DOS) and OS/2 protected mode.
8.4.1 Building Real-Mode Extensions
This section describes how to build extensions for real mode.
Compiling
The source (.C) file for an extension must include EXT.H, the extension
header file. Since an extension is not a stand-alone executable file, it
doesn't have a main function; so its source file is compiled with the /c
(compile, but don't link) option:
CL /c /Gs /ACw CENTER.C
The /Gs option turns off stack checking; the /ACw option selects the
required custom memory model.
PWB extension interface is designed for C programmers. However, you can
write extensions in assembly language or other languages if you simulate the
required C memory model (in which SS is not assumed to equal DS).
Linking
The first object file in the link command must be the stub EXTHDR.OBJ:
link exthdr center, center.mxt;
PWB can load a file with any name, but most programmers use the .MXT
extension to distinguish a PWB extension from a normal .EXE file.
Loading the Extension
Once the extension is built, you can cause PWB to load it by adding a load
command to your TOOLS.INI file:
load:center
You don't need to supply a file extension; PWB assumes the correct file
extension. To specify a path, supply the path name preceded by a dollar sign
($):
load:$INIT:center
The example tells PWB to search the directories specified in the INIT
environment variable. If listed, the environment variable must be in
uppercase.
TOOLS.INI can contain multiple load commands for different extensions.
However, loading each extension involves a certain amount of memory
overhead, and there is no way to unload an extension from memory. To
conserve memory, place all frequently used functions in a single extension
and load only that extension.
Assigning Keystrokes to Functions
After an extension has been loaded, you must provide some way to invoke its
functions from inside PWB. A keystroke is the most common means, although
extension functions, like native PWB functions, can be invoked in various
ways.
You can assign the ALT+C keystroke to the CenterLine function with:
CenterLine:alt+c
Once the CenterLine function has been assigned to this keystroke, you can
invoke it by pressing ALT+C.
8.4.2 Building Protected-Mode Extensions
The build process for OS/2 protected mode differs only slightly from the
real-mode build process.
Compiling
The source (.C) file for an extension must include EXT.H, the extension
header file. Since an extension is not a stand-alone executable file, it
doesn't have a main function; so its source file is compiled with the /c
(compile, but don't link) option:
CL /c /Gs /ACw CENTER.C
The /Gs option turns off stack checking; the /ACw option selects the
required custom memory model.
PWB extension interface is designed for C programmers. However, you can
write extensions in assembly language or other languages if you simulate the
required C memory model (in which SS is not assumed to equal DS).
Linking
Link with EXTHRDP.OBJ instead of EXTHDR.OBJ. Specify the .PXT extension for
the output file. List the EXT.DEF definitions file:
link exthdrp center, center.pxt,, os2, ext.def
Loading the Extension
In protected mode, PWB assumes the .PXT file extension. If your extension is
not found, PWB assumes the .DLL file extension.
You cannot create a bound extension.
There is no way to create a bound extension (one that runs in both real and
protected mode). However, you can build separate versions of an extension
and use a single TOOLS.INI load command to load the correct extension in
each mode. PWB loads the real-mode file (.MXT) in real mode and the
protected-mode file (.PXT or .DLL) in protected mode.
Assigning Keystrokes to Functions
After an extension has been loaded, you must provide some way to invoke its
functions from inside PWB. A keystroke is the most common means, although
extension functions, like native PWB functions, can be invoked in various
ways.
You can assign the ALT+C keystroke to the CenterLine function with:
CenterLine:alt+c
Once the CenterLine function has been assigned to this keystroke, you can
invoke it by pressing ALT+C.
8.4.3 Describing Functions and Switches
To call functions in your extension, PWB must know certain information about
each function, such as the name and address of the function, what types of
arguments it accepts, and what switches (if any) it employs. You provide
this information in a pair of arrays─cmdTable and swiTable─that must be
present in every PWB extension.
The cmdTable Array
Every extension must contain an array of structures named cmdTable. This
array provides the information PWB needs to call the extension's functions.
The cmdTable array is an array of structures of type cmdDesc (which is
declared in EXT.H). Each structure in the array describes one function in
the extension. The array is terminated with a structure whose members are
all null.
For instance, the CENTER.C extension has one function, named CenterLine, so
its cmdTable array contains two structures (one for CenterLine and the
other to terminate the table):
struct cmdDesc cmdTable[] =
{
{ "CenterLine", CenterLine, 0, NOARG | LINEARG },
{ NULL, NULL, 0, 0 }
};
Each cmdDesc structure in cmdTable contains these members:
■ The function's name
■ The function's address
■ Reserved item (must be 0)
■ The argument types the function accepts
The last member in the list is an integer containing bitflags representing
types of arguments that your function accepts. You can combine more than one
bitflag using the OR ( | ) operator.
For instance, the CenterLine function can handle an argument of the type
LINEARG, or no arguments (NOARG). So it lists the types:
NOARG | LINEARG
There are many argument types in addition to these. For information about
specific argument types, see the Extensions topic in on-line help.
The swiTable Array
Extension functions, such as native PWB functions, can respond to user-
configurable switches. From the viewpoint of an extension function, a switch
is usually a variable that the user can change at run time. Your function
must be ready to respond to these changes, and PWB must have some way to
convey them. The vehicle for this interchange is an array of structures
named swiTable.
The swiTable array is similar to the cmdTable array described above. It is
an array of structures, terminated by a structure whose members are all
null. Each structure in swiTable describes one switch used by a function in
your extension.
The CENTER.C extension doesn't take any switches, so its swiTable array only
contains a terminating null structure:
struct swiDesc swiTable[] =
{
{ NULL, NULL, 0 }
};
Each structure in swiTable is of type swiDesc, whose members are
■ A pointer to the switch name
■ A pointer to the switch or a function
■ A flag that indicates the type of the switch
A switch can be one of three types: SWI_BOOLEAN for TRUE/FALSE conditions,
SWI_NUMERIC for numerics, or SWI_SPECIAL for strings.
The second member of swiDesc is a pointer. It points to the switch itself if
the switch is type SWI_BOOLEAN or SWI_NUMERIC, or to a string-handling
function if the switch is type SWI_SPECIAL.
For instance, the following code creates a numeric switch with the default
value 27:
static int n = 27;
struct swiDesc swiTable[] =
{
{ "newswitch", &n, SWI_NUMERIC | RADIX10 },
{ NULL, NULL, 0 }
};
The first structure in the example above contains the name of the switch
("newswitch"), a pointer to the variable that contains the switch's value
(&n), and the switch's type (SWI_NUMERIC).
In this example, the third structure member contains another constant,
RADIX10. If a switch is type SWI_NUMERIC, you must supply a second constant
to tell PWB whether to interpret user-assigned values as decimal (RADIX10)
or hexadecimal (RADIX16) numbers.
If the switch is type SWI_SPECIAL, the second member of swiDesc is a pointer
to an additional string-handling function that you write. This function must
be of type int far _pascal. Each time the text switch changes, PWB calls
your function, passing it the address of the updated string as a char far
pointer. The following code stores the updated string in a buffer named
mystring:
char mystring[BUFLEN];
int far _pascal setstr( char far *ptr )
{
strcpy( mystring, ptr );
}
If desired, you can list switches for extension functions separately from
other switches. Whenever PWB loads an extension, it looks in TOOLS.INI for a
section with this form:
[PWB-ext]
where ext is the base name of the extension. If the extension exists, PWB
recognizes the settings immediately following the tag. For instance, if your
extension SAMPLE.MXT uses a numeric switch named numbills, you can set
numbills to the value 66 with:
[PWB-SAMPLE]
numbills:66
8.4.4 Initializing Functions
Every PWB extension must contain a function named WhenLoaded, which PWB
calls immediately after loading the extension. The WhenLoaded function
provides a chance to do any initialization that your functions require. (If
your functions don't need any initialization, they can simply return.)
The CENTER.C extension uses WhenLoaded to display a loading message:
void EXTERNAL WhenLoaded( void )
{
DoMessage( "Loading Center extension" );
}
DoMessage is a PWB function that displays a message on the dialog line.
Section 8.4.7, "Calling PWB Functions," lists PWB functions and explains how
to call them.
8.4.5 Prototyping Functions
To be called by PWB, each extension function must be declared as type
PWBFUNC and accept the parameters argData, pArg, and fMeta. The CenterLine
function in the section of CENTER.C code below follows this model:
PWBFUNC CenterLine( unsigned argData,
ARG _far *pArg,
flagType fMeta )
The PWBFUNC type is actually a macro that evaluates to flagType _pascal
_loadds _far. The flagType return type declares that the function returns
either TRUE (nonzero) or FALSE (zero). Your function should return a value
so that it can be used in a macro with conditionals. The modifiers _pascal,
_loadds, and _far specify the calling conventions PWB expects editor
functions to have.
8.4.6 Receiving Parameters
Like native PWB functions, extension functions can receive parameters from
the user. The CENTER.C example allows you to select a range of lines to
center, for example. The selected range is passed as a parameter to the
CenterLine function.
Extension functions receive parameters in much the same way ordinary C
programs receive command-line parameters. In both cases, the parameters are
passed in a predefined data construct─argc and argv for a normal C program,
and the following parameters for an extension function:
Parameter Description
────────────────────────────────────────────────────────────────────────────
argData The keystroke used to invoke your
function
pArg A pointer to a structure containing
arguments passed to your function
fMeta TRUE (nonzero) if meta precedes the
argument, otherwise FALSE (zero)
The first parameter is rarely used. Most extension functions receive all
their parameter data in the second parameter, pArg. This parameter is a
pointer to a structure of type ARG, which contains:
Parameter Description
────────────────────────────────────────────────────────────────────────────
argType An integer that indicates the argument
type
arg A union of structures, one structure for
each
argument type
Typically, your function tests pArg->argType to find out what type of
parameter PWB has passed. Once the type is known, the function responds
accordingly. The following code from CENTER.C handles two argument types:
switch( pArg->argType )
{
case NOARG: /* No argument. Center current line */
yStart = yEnd = pArg->arg.noarg.y;
break;
case LINEARG: /* Center range of lines */
yStart = pArg->arg.linearg.yStart;
yEnd = pArg->arg.linearg.yEnd;
break;
}
PWB rejects invalid arguments.
If your function takes only one argument, it doesn't need to test
pArg->argType at all. PWB knows beforehand what argument types your function
accepts (via cmdDesc) and rejects any invalid arguments.
Once the argument type is known, your function can access the parameters
through pArg->arg, a structure whose members differ for each argument type.
In the NOARG (no arguments) case, it contains x and y values identifying the
cursor position in the current file:
struct noargType
{ /* no argument */
LINE y; /* cursor line */
COL x; /* cursor column */
};
The CENTER.C example uses the y value in this structure (noarg.y, the cursor
line) to center the current line:
case NOARG: /* No argument. Center current line */
yStart = yEnd = pArg->arg.noarg.y;
break;
Similarly, in the LINEARG case, the pArg->arg structure contains three
values:
struct lineargType
{ /* line argument specified */
int cArg; /* count of args pressed */
LINE yStart; /* starting line of range */
LINE yEnd; /* ending line of range */
};
The CENTER.C example uses the starting and ending values in this structure
(yStart and yEnd) to center a range of selected lines:
case LINEARG: /* Center range of lines */
yStart = pArg->arg.linearg.yStart;
yEnd = pArg->arg.linearg.yEnd;
break;
The method is the same for other argument types. The pArg->arg structures
for all argument types are described in on-line help.
8.4.7 Calling PWB Functions
Many of PWB's internal functions are public. Your extension function can
call them for the same purposes that PWB itself does. This section
demonstrates the most commonly used PWB functions─those that manipulate the
current file.
A list of callable PWB functions appears near the end of this section. For
complete information on specific PWB functions, consult on-line help.
Getting a File Handle
Extension functions can do many different tasks, but they typically
manipulate a file in some way. The extension function in the CENTER.C
example rewrites a line or lines in the current file, for example. The
current file is the one that appears in the editing window. Since it is
already open for editing, you can access the current file without opening
it. Simply assign its file handle to a variable in your function.
PWB file-handling functions use file handles of type PFILE. The CENTER.C
example declares the following handle variable:
PFILE pFile;
The FileNameToHandle function gets a handle to a file that is already open
for editing:
pFile = FileNameToHandle( "", "" );
The function takes two string arguments. If the first string is null, as
here, the FileNameToHandle function returns a handle to the current file.
You can use the AddFile function to get handles to other files (in which
case you may need to use other PWB functions such as FileRead).
Reading a Line From the File
Once your function has a file handle, it can read from the file with the
GetLine function, which reads one line at a time:
len = GetLine( yStart, buf, pFile );
The first argument is a line number, the second a pointer to a buffer, and
the third a file handle. So the above call reads line number yStart from
the file whose handle is pFile into the buffer buf. Note that the first
line in a file is line 0, not line 1.
Once you have read a line into a local buffer, you can manipulate it as
desired. CENTER.C uses its buffer buf to center the line's text.
Writing a Line to the File
After modifying a line, you can write it back to the file. The PutLine
function writes one line at a time:
PutLine( yStart, buf, pFile );
PutLine takes the same arguments as GetLine─a line number, buffer pointer,
and file handle. In CENTER.C, the above call writes the line from buf to
line yStart in the file whose handle is pFile.
Summary of PWB Functions
If you understand how CENTER.C works, you know the basics of using PWB
functions in your own functions. The rest is just a matter of learning the
details of individual functions. Table 8.1 lists the PWB functions, grouping
them by category. For additional information on specific functions, consult
on-line help.
Table 8.1 Callable PWB Functions
╓┌──────────────────┌──────────────────┌─────────────────────────────────────╖
Category Function Description
────────────────────────────────────────────────────────────────────────────
Block Operations CopyBox Insert rectangular area
Category Function Description
────────────────────────────────────────────────────────────────────────────
Block Operations CopyBox Insert rectangular area
CopyLine Insert range of lines
CopyStream Insert stream of text
DelBox Delete rectangular area
DelLine Delete range of lines
DelStream Delete stream of text
Build fGetMake Get extmake setting
fSetMake Set extmake setting
Color GetColor Get color of specified line
PutColor Set color of specified line
Category Function Description
────────────────────────────────────────────────────────────────────────────
PutColor Set color of specified line
Cursor GetCursor Get cursor position
MoveCur Move cursor
Dialog DoMessageBox Create message dialog
PopUpBox Display text in dialog
window
Display BadArg Report that argument was invalid
Display Update screen
DoMessage Display message on dialog line
File AddFile Open new file and get file handle
Category Function Description
────────────────────────────────────────────────────────────────────────────
DelFile Delete contents of file buffer
fChangeFile Change current file to named file
FileNameToHandle Get handle to open file
FileRead Copy disk file to file
buffer
FileWrite Copy file buffer to disk file
Table 8.1 (continued)
╓┌────────────────┌─────────────────┌────────────────────────────────────────╖
Category Function Description
────────────────────────────────────────────────────────────────────────────
Category Function Description
────────────────────────────────────────────────────────────────────────────
pFileToTop Make specified file the current file
RemoveFile Remove file from memory
Keyboard KbHook Restore keyboard control to PWB
KbUnHook Remove keyboard control from PWB
ReadChar Get information on next keystroke
Format ReadCmd Get keystroke information in CmdDesc
Line FileLength Get length of file
GetLine Get line from file
PutLine Write line to file
List GetListEntry Get item from list
Category Function Description
────────────────────────────────────────────────────────────────────────────
List GetListEntry Get item from list
ScanList Process list
Memory Falloc Allocate far memory
Fdalloc Deallocate far memory
Miscellaneous fExecute Execute macro
FindSwitch Get information about switch
GetEditorObject Get internal PWB data item
GetString Get input from dialog line
mgetenv Get environment string
NameToFunc Get information about function or macro
Category Function Description
────────────────────────────────────────────────────────────────────────────
NameToFunc Get information about function or macro
NameToKeys Get key(s) assigned to specified
function
Replace Replace character
SetEditorObject Set internal PWB data item
SetKey Assign function to
keystroke
Search REsearch Search for regular
expression
search Search for string
Virtual Memory fpbtoVM Copy data to virtual memory
Category Function Description
────────────────────────────────────────────────────────────────────────────
VMalloc Allocate virtual memory
VMFree Free virtual memory
VMtofpb Copy data from virtual memory
Window CloseWnd Close window
Resize Resize window
SplitWnd Split window
────────────────────────────────────────────────────────────────────────────
8.4.8 Calling C Library Functions
You can write many useful extension functions using only PWB functions
listed in the previous section. It is also possible to call C library
routines, with some limitations. An extension written for OS/2 protected
mode can call any C library routine if it is linked with EXTHDRP.OBJ and the
.DLL C run-time library. The list of usable routines is shorter for
real-mode (DOS) extensions linked with the non-.DLL run-time library.
Before you call a C library routine, ask whether the task can be done with a
PWB function. If the answer is yes, you should always call a PWB function in
preference to the C library routine. This practice ensures compatibility
between your functions and PWB.
The following categories of C library routines are always safe to use in
real mode:
■ Buffer manipulation
■ Character classification and conversion
■ Data conversion
■ String manipulation
This list includes the library routines you are most likely to need in an
extension function. If your extension function calls C library functions,
you must link with the compact-model C library.
The following routines should not be used in real mode:
■ Routines that need C start-up support (most input/output functions)
■ Memory management routines, such as malloc, and routines that call
them
■ Process control routines such as spawn and exec
If you are in doubt about a particular C library routine, you can always use
it and see what happens. If the linker displays the following message,
error L2044: __acrtused : symbol multiply defined, use /NOE
the routine requires C start-up support and should not be used.
Chapter 9 Debugging C Programs with CodeView
────────────────────────────────────────────────────────────────────────────
Even experienced programmers occasionally find bugs in their programs. This
chapter explores techniques that will help you locate these errors quickly,
using the Microsoft CodeView debugger.
This chapter describes:
■ How to display and modify variables and memory
■ How to control the flow of execution while debugging
■ Advanced CodeView debugging techniques
■ How to control CodeView's behavior with command-line switches and the
TOOLS.INI file
CodeView supports the Microsoft mouse (or any fully compatible pointing
device). All operations are described first using the mouse; the keyboard
command follows.
For information about debugging OS/2 programs that use threads or processes,
see Chapter 15, "Creating OS/2 Multithread Applications."
9.1 Understanding CodeView Windows
CodeView divides the screen into logically separate sections called windows,
so that a large amount of information can be displayed in an organized and
easy-to-read fashion. Each window is a discrete section of the display that
operates independently of the other windows.
Each window displays a different type of data.
Each CodeView window has a distinct function. The name of each window
described below appears in the top of the window's frame:
■ The Source window displays the source code. You can open a second
Source window to view an include file, another source file, or the
same source file at a different location.
■ The Command window accepts debugging commands.
■ The Watch window displays the current values of selected variables.
■ The Local window lists the values of all variables local to the
current function or block.
■ The Memory window shows the contents of memory. You can open a second
Memory window to view a different section of memory.
■ The Register window displays the contents of the microprocessor's
registers, as well as the processor flags.
■ The 8087 window displays the registers of the coprocessor or its
software emulator.
CodeView starts running with three windows displayed. The Local window is at
the top, the Source window fills the middle of the screen, and the Command
window is at the bottom.
There are two ways to open windows. You can choose the desired window from
the View menu. (Note that you can open more than one of certain windows,
such as Source or Memory.) In addition, some operations (such as selecting a
Watch variable) open the appropriate window automatically, if it is not
already open.
All displays are updated automatically.
CodeView continually and automatically updates the contents of all windows.
However, if you want to interact with a particular window (for instance, to
enter a command, set a breakpoint, or modify a variable), you must select
that window as the focus of user interaction.
The selected window is called the "current" window. The current window is
marked in three ways:
■ The window's name is highlighted in white.
■ The text cursor appears in the window.
■ The vertical and horizontal scroll bars are moved into the window.
To select a new current window, click left in the window (position the mouse
cursor in the window and press the left mouse button) that you want to be
current. You can also press F6 or SHIFT+F6 to move the focus from one window
to the next.
Windows often contain more information than can be displayed in the area
allotted to the window. There are two ways to view these additional
contents. You can drag on the window's horizontal or vertical scroll bars.
(Position the mouse pointer on the bar and, while holding down the left
mouse button, drag the mouse in the appropriate direction.) You can also use
the direction keys (LEFT, RIGHT, UP, DOWN) to move the text cursor.
Typing commands into the Source window causes CodeView to temporarily shift
its focus to the Command window. Whatever you type is appended to the last
line in the Command window. If the Command window is closed, CodeView beeps
in response to your entry and ignores the input.
Adjusting the Windows
Although you cannot change the relative positions of the windows, you can
change their size or remove them. The Maximize, Size, and Close commands
from the View menu perform these functions, or you can press CTRL+F10,
CTRL+F8, and CTRL+F4, respectively. Window manipulations are especially easy
with a mouse:
■ To maximize a window (enlarge it so it fills the screen), click left
on the up arrow at the right end of the window's top border. To
restore the window to its previous size and position, click left on
the double arrow at the right end of the top border.
■ To change the size of a window, position the mouse pointer anywhere
along the white line at the top of the window. Press and hold down the
left mouse button. When two double arrows appear on the line, you can
drag the mouse to enlarge or reduce the window. The same action on a
vertical border widens or narrows the window.
■ To close a window, click left on the dot at the left end of the top
border. You can also close any window in the View menu whose name has
a dot next to it by selecting that window from the menu or by pressing
that window's acclerator key. The adjacent windows automatically
expand to recover the empty space.
CodeView stores session information in a file called CURRENT.STS, which is
created in the directory pointed to by the INIT environment variable. The
session information includes such items as the name of the program being
debugged, which CodeView windows were open, and the breakpoint locations.
This information becomes the default status the next time you run CodeView.
9.2 Overview of Debugging Techniques
There is no single best approach to debugging for all programs or users.
CodeView offers a variety of debugging tools that let you pick a method
appropriate to the program or your work habits. The following section may
help you decide how to approach a particular program.
Broadly speaking, two things can go wrong in a program:
■ The program doesn't manipulate the data the way you expected it to.
■ The flow of execution is incorrect.
These problems occasionally overlap. Incorrect execution can corrupt the
data, and bad data can cause execution to take an unexpected turn. Because
CodeView allows you to trace program execution and display whatever
combination of variables you want simultaneously, you don't have to know
ahead of time whether the problem is bad data manipulation, a bad execution
path, or some combination of these.
CodeView has features that deal specifically with the problems of bad data
and incorrect execution:
■ You can view and modify any program variable, any section of memory,
or any processor register.
■ You can monitor the path of execution and precisely control where
execution pauses.
The following sections explain how to view and modify data and describe how
execution is controlled.
9.3 Viewing and Modifying Program Data
The CodeView debugger offers a variety of ways to display program variables,
processor registers, and memory. You can also modify the values of all these
items as the program executes. This section shows how to display and modify
variables, registers, and memory.
9.3.1 Displaying Variables in the Watch Window
To add a variable to the Watch window, position the cursor on the name of
the variable using either the mouse or the direction keys (LEFT, RIGHT, UP,
DOWN). Then select the Add Watch command from the Watch menu, or press
CTRL+W.
A dialog box appears with the selected variable's name displayed in the
Expression field. If you don't want to watch the variable shown, type in the
name of the variable you want to watch. Pressing ENTER or clicking left on
the OK button adds this variable to the Watch window.
The Watch window appears at the top of the screen. Adding a Watch variable
automatically opens the Watch window if the window doesn't already exist.
A newly added variable may be followed by the message:
<Watch Expression Not in Context>
This message appears when program execution has not yet reached the block
where the variable is defined. (A block is a section of code enclosed in
curly braces.) Global variables (those declared outside C functions) never
cause CodeView to display this message; they can be watched from anywhere in
the program.
To remove a variable from the Watch window, use the Delete Watch command
from the Watch menu, and select the variable to be removed using the list in
the dialog box. You can also position the cursor on any line in the Watch
window and press CTRL+Y to delete the line.
There is no limit to how many variables you can watch.
You can place as many variables as you like in the Watch window; the
quantity is limited only by available memory. You can scroll through the
Watch window to position it at those variables you want to view. CodeView
automatically updates all watched variables as the program runs, including
those not currently visible.
Loops (do, for, or while) cause problems when they don't terminate
correctly. Displaying loop variables in the Watch window is an easy way to
determine whether a loop variable achieves its proper value.
9.3.2 Displaying Expressions in the Watch Window
You may have noticed that the Add Watch dialog box prompts for an
expression, not simply a variable name. As this suggests, you can enter an
expression (that is, any valid combination of variables, constants, and
operators) for CodeView to evaluate and display.
Expressions can use the syntax of other languages.
You are not limited to evaluating C expressions. The Language command of the
Options menu offers a choice of BASIC or FORTRAN expression evaluation, if
one of these languages better suits your needs. The ability to select the
language evaluator is especially useful when debugging mixed-language
programs. Remember that C-specific features, such as type casting or pointer
conversions, are not available in other languages.
You can display more information with expressions than with individual
variables.
By reducing several variables to a single, easily read value, an expression
can be easier to interpret than the components that make it up. Imagine a
for loop with two variables whose ratio is supposed to remain constant. You
suspect that one of these variables (you aren't sure which) sometimes takes
the wrong value. With (var1 / var2) displayed as an expression in the Watch
window, you can easily see when this single value changes; you don't have to
mentally divide two numbers.
You can also display Boolean expressions. For example, if a variable is
never supposed to be larger than 100 or less than 25, (var < 25 || var >
100) evaluates to 1 (true) when var goes out-of-bounds.
9.3.3 Displaying Arrays and Structures
Most program variables are scalar quantities─a single character or a single
integer or floating-point value. These appear in the Watch window with the
variable name to the left, followed by an equal sign (=) and the current
value.
You can view arrays and structures in expanded form.
Arrays and structures contain multiple values, arranged in one or more
layers. They are often referred to as "aggregate" data items. CodeView lets
you control how much of these variables is shown; that is, whether all,
part, or none of their internal structure is displayed.
An array initially appears in the Watch window in this form:
+wordholder[] = [...]
The brackets indicate that this variable contains more than one element. The
plus sign (+) indicates that the variable has not yet been expanded to
display its components.
To expand the array, double-click anywhere on the line. You can also
position the cursor on the line and press ENTER. For example, if wordholder
is a six-character array containing the word "Basic," the Watch window
display changes to the following :
-wordholder[]
[0] = 66 'B'
[1] = 97 'a'
[2] = 115 's'
[3] = 105 'i'
[4] = 99 'c'
[5] = 0 ''
Note that both the individual character values and their ASCII decimal
equivalents are listed. The minus sign (-) indicates no further expansion is
possible. To contract the array, double-click on its line (or position the
cursor on the line and press ENTER) again.
If it is inconvenient to view a character array in this form, cast the
variable's name to a character pointer by placing (char *) in front of the
name. The character array is then displayed as a string delimited by
apostrophes.
You can display arrays with more than one dimension. Imagine a 5 x 5 integer
array named matrix, whose diagonal elements are the numbers 1 through 5 and
whose other elements are zero. Unexpanded, the array is displayed like this:
+matrix[] = [...]
Double-clicking on matrix (or pressing ENTER) changes the display:
-matrix[]
+[0][] = [...]
+[1][] = [...]
+[2][] = [...]
+[3][] = [...]
+[4][] = [...]
The actual values of the elements are not shown yet. You have to descend one
more level to see them. To view the elements of the third row of the array,
position the cursor anywhere on the fourth line and press ENTER:
-matrix[]
+[0][] = [...]
+[1][] = [...]
-[2][]
[0] = 0
[1] = 0
[2] = 3
[3] = 0
[4] = 0
+[3][] = [...]
+[4][] = [...]
Expanding the fifth row of the array produces this display:
-matrix[]
+[0][] = [...]
+[1][] = [...]
-[2][]
[0] = 0
[1] = 0
[2] = 3
[3] = 0
[4] = 0
+[3][] = [...]
-[4][]
[0] = 0
[1] = 0
[2] = 0
[3] = 0
[4] = 5
You can view individual elements instead of the entire array.
Any element of an array (or structure) can be independently expanded or
contracted. If you only want to view one or two elements of a large array,
specify the particular array or structure elements in the Expression field
of the Add Watch dialog box; you need not display every element of the
variable.
You can dereference pointers.
You can dereference a pointer in the same way as you expand an array or
structure. The pointer address is displayed, followed by all the elements of
the variable to which the pointer currently refers. Multiple levels of
indirection (that is, pointers referencing other pointers) can be displayed
simultaneously.
9.3.4 Displaying Array Elements Dynamically
You do not have to display every element of an array. If specific subscripts
are given, the corresponding element is displayed.
You can also specify a dynamic array element, which changes as some other
variable changes. For example, suppose that the loop variable p is a
subscript for the array variable catalogprice. The Watch window expression
catalogprice[p] displays only the array element currently specified by p,
not the entire array.
You can mix constant and variable subscripts. For example, the expression
bigarray[3][i] displays only the element in the third row of the array to
which the index variable i points.
9.3.5 Using Quick Watch
Selecting the Quick Watch command from the Watch menu (or pressing SHIFT+F9)
displays the Quick Watch dialog box. If the text cursor is in the Source,
Local, or Watch window, the variable at the current cursor position appears
in the dialog box. If this is not the item you wish to display, type in the
desired expression or variable, then press ENTER. The selected item is
displayed immediately.
The Quick Watch display automatically expands arrays and structures to their
first level. For example, an array with three dimensions is expanded to the
first dimension. You can expand or contract an element just as you would in
the Watch window: position the cursor on the appropriate line and press
ENTER. If the array needs more lines than the Quick Watch window can
display, drag the mouse along the scroll bar, or press DOWN or PGDN to view
the rest of the array.
You can add Quick Watch variables to the Watch window.
If you decide to add a Quick Watch item to the Watch window, select the Add
Watch button. Arrays and structures appear in the Watch window expanded as
they were displayed in the Quick Watch box.
Quick Watch is a convenient way to take a quick look at a variable or
expression. Since only one Quick Watch variable can be viewed at a time, you
would not use Quick Watch for most of the variables you want to view.
9.3.6 Displaying Memory
Selecting the Memory command from the View menu opens a Memory window. Up to
two Memory windows can be open at one time.
By default, memory is displayed as hexadecimal byte values, with 16 bytes
per line. At the end of each line is a second display of the same memory in
ASCII form. Values that correspond to printable ASCII characters (decimal 32
through 127) are displayed in that form. Values outside this range are shown
as periods.
You can display memory values in any form.
Byte values are not always the most convenient way to view memory. If the
area of memory you're examining contains character strings or floating-point
values, you might prefer to view them in a directly readable form. The
Memory Window command of the Options menu displays a dialog box with a
variety of display options:
■ ASCII characters
■ Byte, word, or double-word binary values
■ Signed or unsigned integer decimal values
■ Short (32 bit), long (64 bit), or ten-byte (80 bit) floating-point
values
You can also directly cycle through these display formats by pressing F3.
If a section of memory cannot be displayed as a valid floating-point number,
the number shown includes the characters NAN (not a number).
Displaying Variables with a Live Expression
Section 9.3.4, "Displaying Array Elements Dynamically," explains how to
display a specific array element by adding the appropriate expression to the
Watch window. It is also possible to watch a particular memory area that
your program uses to store data in the Memory window. This CodeView display
feature is called a "live expression."
"Live" means that the area of memory displayed changes to reflect the value
of a pointer or subscript. For example, if buffer is an array and pbuf
is a pointer to that array, then *pbuf points to the array element
currently referenced. A live expression displays the section of memory
beginning with this element. If your program changes the value of pbuf,
CodeView dynamically adjusts the Memory window display.
Live expressions are displayed in a Memory window, not in the Watch window.
To create a live expression, select the Memory Window command of the Options
menu, then select the Live Expression check box. Enter the name of the
element you want to view. For example, if strgptr is a pointer to an array
of characters, and you want to see what it currently points at, enter
*strgptr. Then select the OK button or press ENTER to view that memory area.
A new Memory window opens. The first memory location in the window is the
first memory location of the live expression. The section of memory
displayed changes to the section the pointer currently references.
You can use the Memory Window command of the Options menu to display the
value of the live expression in a directly readable form. This is especially
convenient when the live expression represents strings or floating-point
values, which are difficult to interpret in hexadecimal form.
It is usually more convenient to view an item in the Watch window than as a
live expression. However, some items are more easily viewed as live
expressions. For example, you can examine what is currently on top of the
stack. Enter SS:SP as the live expression.
9.3.7 Displaying the Processor Registers
Selecting the Register command from the View menu (or pressing F2) opens a
window on the right side of the screen. The current values of the
microprocessor's registers appear in this window.
At the bottom of the window is a group of mnemonics representing the
processor flags. When you first open the Register window, all values are
shown in normal-intensity video. Any subsequent changes are marked in
high-intensity video. For example, suppose the overflow flag is not set when
the Register window is first opened. The corresponding mnemonic is NV and it
appears in light gray. If the overflow flag is subsequently set, the
mnemonic changes to OV and appears in bright white.
Selecting the 386 Instructions command from the Options menu displays the
registers as 32-bit values, but only if your computer uses an 80386
processor, and only when running the real-mode version of CodeView.
Selecting this command a second time toggles back to a 16-bit display.
You can also display the registers of an 8087/287/387 coprocessor in a
separate window by selecting the 8087 command from the View menu. If your
program uses the coprocessor emulator, the emulated registers are displayed
instead.
9.3.8 Modifying the Values of Variables, Registers, and Memory
You can easily change the values of variables, memory locations, or
registers displayed in the Watch, Local, Memory, Register, or 8087 windows.
Simply position the cursor at the value you want to change and edit it to
the appropriate value. If you change your mind, press ALT+BKSP to undo the
last change you made.
The starting address of each line of memory displayed is shown at the left
of the Memory window, in CS:IP form. Altering the address automatically
shifts the display to the corresponding section of memory. If that section
is not used by your program, memory locations are displayed as double
question marks (??).
Byte display form is different from other forms.
When you select Byte display from the Memory Window Options dialog box,
CodeView presents both a hexadecimal and an ASCII representation of the data
in memory. (Byte display is the default.) You can change data in memory
either by entering new hex values over the hexadecimal representation of
your data or by entering character values over the character representation.
To toggle a processor flag, click left on its mnemonic. You can also
position the cursor on a mnemonic, then press any key (except TAB or SPACE).
Repeat to restore the flag to its previous setting.
Be cautious when modifying memory or a register.
The effect of changing a register, flag, or memory location may vary from no
effect at all, to crashing the operating system. You should be cautious when
altering "machine-level" values; most of the items you would want to change
can be altered from the Watch window.
One instance where direct manipulation of register values can be valuable is
when you are debugging in-line assembly code. You can change register values
to test assumptions before making changes in your source code and
recompiling.
9.4 Controlling Execution
There are two forms of program execution under CodeView:
■ Continuous; the program executes until either a previously specified
"breakpoint" has been reached or the program terminates normally.
■ Single-step; the program pauses after each line of code has been
executed.
Sections 9.4.1 and 9.4.2 explain how each form of execution works and the
most effective way to use each.
9.4.1 Continuous Execution
Continuous execution lets you quickly execute the bug-free sections of code,
which would otherwise take a long time to execute a single step at a time.
The simplest form of continuous execution is to click right (position the
mouse pointer and press the right mouse button) anywhere on the line of code
you want to debug or examine in more detail. The program executes at full
speed up to the beginning of this line, then pauses. You can do the same
thing by positioning the text cursor on this line, then pressing F7.
You can also pause execution at a specific line of code with a "breakpoint."
There are several types of breakpoints. Breakpoints are explained in the
following section.
Selecting Breakpoint Lines
Breakpoints can be tied to lines of code.
You can skip over the parts of the program that you don't want to examine by
specifying one or more lines as "breakpoints." The program executes at full
speed up to the first breakpoint, then pauses. Pressing F5 continues program
execution up to the next breakpoint, and so on. (You can halt execution at
any time by pressing CTRL+BREAK or ALT+SYSRQ.)
There is no limit to the number of breakpoints.
You can set as many breakpoints as you like (limited only by available
memory). There are several ways to set breakpoints:
■ Double-click anywhere on the desired breakpoint line. The selected
line is highlighted to show that it is a breakpoint. To remove the
breakpoint, double-click on the line a second time.
■ Position the cursor anywhere on the line at which you want execution
to pause. Press F9 to select the line as a breakpoint. (CodeView
highlights lines that have been selected as breakpoints.) Press F9 a
second time to remove the breakpoint.
■ Display the Set Breakpoint dialog box by selecting Set Breakpoint from
the Watch menu. Choose one of the breakpoint options that permits a
line ("location") to be specified. The line on which the text cursor
currently rests is the default breakpoint line in the Location field.
If this line is not the desired breakpoint, enter the line number
desired. (The line number must begin with a period.) Use F9 or the
Edit Breakpoints screen of the Watch menu to remove the breakpoint.
Not every line can be a breakpoint.
A breakpoint line must be a program line that represents executable code.
You cannot select a blank line, a comment line, or a declaration line (such
as a variable declaration or a preprocessor statement) as a breakpoint.
A breakpoint can also be set at a function or an explicit address. To set a
breakpoint at a function, simply enter its name in the Set Breakpoint dialog
box. To set a breakpoint at an address, enter the address in CS:IP form.
────────────────────────────────────────────────────────────────────────────
NOTE
By default, Microsoft compilers optimize your code. In the process of
optimization, some lines of code may be repositioned or reorganized for more
efficient execution. These changes can prevent CodeView from recognizing the
corresponding lines of source code as breakpoints. Therefore, it is a good
idea to disable optimization during development (use the /Od switch). You
can restore optimization once debugging is completed.
────────────────────────────────────────────────────────────────────────────
Once execution has paused, you can continue execution by pressing F5 or
clicking left on the <F5> button in the display.
Setting Breakpoint Values
Breakpoints can be tied to variables.
Breakpoints are not limited to specific lines of code. CodeView can also
break execution when a variable reaches a particular value, or just changes
value. You can also combine these value breakpoints with line breakpoints,
so that execution stops at a specific line only if a variable has
simultaneously reached a particular value, or changed value. You must use
the check boxes in the Set Breakpoint dialog box to select these other types
of breakpoints.
To pause execution when an expression reaches a particular value, enter that
expression in the Expression field of the Set Breakpoint dialog box. For
example, assume you have declared a tree structure as follows:
struct Tagtree
{
char * s; /* Pointer to a string */
struct TAGtree * left; /* Pointer to left branch */
struct TAGtree * right; /* Pointer to right branch */
};
struct TAGtree t;
You can then pause execution when your tree traversal reaches a terminal
node by entering the expression (t.left == NULL) || (t.right == NULL).
To pause execution when a variable changes value, you need to enter only the
name of the variable in the Expression field. For large variables (such as
arrays or character strings), you can specify the number of bytes you want
checked (up to 32K) in the Length field.
────────────────────────────────────────────────────────────────────────────
NOTE
When a breakpoint is tied to a variable, CodeView must check the variable's
value after each machine instruction is executed. This slows execution
greatly. For maximum speed when debugging, either tie conditional
breakpoints to specific lines, or set conditional breakpoints only after you
have reached the section of code that needs to be debugged.
────────────────────────────────────────────────────────────────────────────
Using Breakpoints
Here are several examples that show how breakpoints can help you find the
cause of a problem.
One of the most common bugs is a for loop that executes too many or too few
times. If you set a breakpoint that encloses the loop statements, the
program pauses after each iteration. With the loop variable or critical
program variables in the Watch or Local windows, it should be easy to see
what the loop is doing wrong.
You can specify how many times a breakpoint line is executed.
You do not have to pause at a breakpoint the first time execution reaches
it. CodeView lets you specify the number of times you want to ignore the
breakpoint condition before pausing. Enter the decimal number in the Pass
Count field of the Set Breakpoint dialog box of the Watch menu.
For example, suppose your program repeatedly calls a function to create a
binary tree. You suspect that something goes wrong with the process about
halfway through. You could mark the line that calls the function as the
breakpoint, then specify how many times this line is to execute before
execution pauses. Running the program creates a representative (but
unfinished) tree structure that can be examined from the Watch window. You
can then continue your analysis using single-stepping.
Another programming error is erroneously assigning a value to a variable.
Enter the variable in the Expression field of the Set Breakpoint dialog box.
Execution breaks whenever this variable changes value.
You can assign new values to variables while execution is paused.
Breakpoints are a convenient way to pause the program so you can assign new
values to variables. For example, if a limit value is set by a variable, you
can change the value to see whether program execution is affected.
Similarly, you can pass a variety of values to a switch statement to see if
they are correctly processed.
This ability to alter variables is an especially convenient way to test new
functions without having to write a stand-alone test program.
9.4.2 Single-Stepping
In single-stepping, CodeView pauses after each line of code is executed. (If
a line contains more than one executable statement, CodeView executes all
the statements on the line before pausing.) The next line to be executed is
highlighted in reverse video.
There are two ways to single-step.
You can single-step through a program with the Step and Trace functions.
Step (executed by pressing F10) steps over function calls. All the code in
the function is executed but, to you, the function appears to execute as a
single step. Trace (executed by pressing F8) traces through every step of
all functions for which CodeView has symbolic information. Each line of the
function is executed as a separate step. (CodeView has no symbolic
information about run-time functions; therefore, they are executed as a
single step.)
You can alternate between Trace and Step as you like. The method you use
depends only on whether you want to see what happens within a particular
function.
You can Trace through the program continuously (without having to press F8),
using the Animate command of the Run menu. The speed of execution is
controlled by the Trace Speed command from the Options menu. You can halt
animated execution at any time by pressing any key.
9.5 Replaying a Debug Session
CodeView can automatically create a "tape" (a disk file) with all the
debugging instructions and input data you entered when testing a program.
The tape is then "replayed" to repeat the debugging process. This dynamic
replay feature is unique to the CodeView debugger and is activated by
selecting the History On command from the Run menu. Selecting History On a
second time terminates recording.
You can use the recording as a bookmark. You can quit after a long debugging
session, then pick up the session later in the same place.
Dynamic replay makes it easy to correct a mistake.
The principal use of dynamic replay is to allow you to back up when you make
an error or overshoot the section of code with the bug. This feature is
important because not all bugs are located when executing the program in a
linear fashion.
For example, you may have to manually execute a function many times before
its bug appears. If you then enter a command that alters the machine's or
program's status and thereby lose the information you need to find the cause
of the bug, you would have had to restart the program and manually repeat
every debugging step to return to that point. Even worse, if you don't
remember the exact sequence of events that exposed the bug, it could take
hours to find your way back.
Dynamic replay eliminates this problem. Selecting the Undo command from the
Run menu automatically restarts the program and rapidly executes every debug
command up to (but not including) the last one you entered. You can repeat
this process as many times as you like until you return to the desired point
in execution.
To add additional steps to an existing tape, select History On, then select
Replay. When replay has completed, perform whatever new debugging steps you
want, then select History On a second time to terminate recording. The new
tape contains both the original and the added commands.
────────────────────────────────────────────────────────────────────────────
NOTE
CodeView records only those mouse commands that apply to CodeView. Mouse
commands recognized by the application being debugged are not recorded.
────────────────────────────────────────────────────────────────────────────
Replay Limitations under OS/2
There are some limitations to dynamic replay when debugging under OS/2:
■ The program must not respond to asynchronous events.
■ Breakpoints must be specified at specific source lines or for specific
symbols (rather than by absolute addresses), or replay may fail.
■ Single-thread programs behave normally during replay. However, one of
the threads in a multithread program may cause an asynchronous event,
violating the first restriction. Multithread programs are, therefore,
more likely to fail during replay.
■ Multiprocess replay will fail. Each new process invokes a new CodeView
session. The existence of multiple sessions makes it impractical to
record the sequence of events if you execute commands in a session
other than the original.
■ Replay under Presentation Manager is not currently supported because
it violates the first restriction.
9.6 Advanced CodeView Techniques
Once you are comfortable displaying and changing variables, stepping through
the program, and using dynamic replay, you might want to experiment with the
advanced techniques explained below.
Setting Command-Line Arguments
If your program retrieves command-line arguments, you can specify them with
the Set Runtime Arguments command from the Run menu. Enter the arguments in
the Command Line field before you begin execution. (Arguments entered after
execution begins cause an automatic restart.)
Multiple Source Windows
You can open two Source windows at the same time. The windows can display
two different sections of the same program, or one can show the high-level
listing and the other the assembly-language listing. In the latter case, the
contents of the windows track, with the next assembly-language instruction
to be executed matching the next line of source code.
You can move freely between these windows, executing a single line of source
code or a single assembly instruction at a time. The assembly-language
window must be opened in CS:IP mode.
Calling Functions
Any C function in your program (whether user-written or from the library)
can be called from the Command window or the Watch window, using the
following format:
?funcname (varlist)
The function is evaluated and the returned value is displayed in the Command
window.
The function does not have to be called by your program to be available for
evaluation. For example, all the .OBJ code specified in the linker input
response file is linked. The functions in this code can then be evaluated
from the Command window.
This feature allows you to run functions from within CodeView that you would
not normally include in the final version of your program. For example, you
could include the OS/2 API functions that control semaphores, then execute
them from the Command window to manipulate the run-time environment at any
point in the debugging process.
Checking for Undefined Pointers
Until a pointer has been explicitly assigned a value, its value is
undefined. That is, its value may be completely random, or it may be some
consistent value that does not point to a useful data address (such as -1).
Accessing data through an uninitialized pointer will cause unpredictable
program behavior and, under OS/2, will usually result in a protection
violation. Because many C programs use pointers heavily, tracking down
exactly which pointer variable was left uninitialized is tedious.
CodeView can help locate the problem quickly. If you use an uninitialized
pointer (or "null pointer" under OS/2) the operating system will generate a
protection violation. By examining the Calls menu, you can determine the
last line of your code that was executed before the protection violation
occurred.
Under DOS, you can take advantage of the fact that global or static
variables are initialized to 0 to track down uninitialized pointers. Set a
conditional breakpoint that stops when location 0 changes, then start
execution. Execution will pause when your program makes an assignment to
that location.
────────────────────────────────────────────────────────────────────────────
NOTE
For near pointers, location 0 is DS:0000; for far pointers, location 0 is
0000:0000.
────────────────────────────────────────────────────────────────────────────
Using Breakpoints Efficiently
Breakpoints slow execution when debugging. You can increase CodeView's speed
by using the /R command-line switch if you have an 80386-based computer.
This switch enables the 386's four debug registers, which support breakpoint
checking in hardware rather than in software.
Printing Selected Items
You can print all or part of the contents of any window with the Print
command from the File menu. The check box lets you print the complete
contents of the window, only the material that is currently viewable in the
window, or selected text from the window. Text is selected by dragging the
mouse across it, or by holding down the SHIFT key and pressing the direction
keys (LEFT, RIGHT, UP, DOWN).
By default, print output is to the file CODEVIEW.LST in the current
directory. You can choose whether the new material will be appended to an
existing file or overwrite it, using the Append/Overwrite check box. If you
would like print output to go to a different file, type its name in the To
File Name field. If you want the output to go to a printer, enter the
appropriate device name, such as LPT1 or COM2.
Handling Register Variables
A register variable is stored in one of the microprocessor's registers,
rather than in RAM. This speeds access to the variable.
There are two ways for a conventional variable to become a register
variable. One way is declaring the variable as a register variable; if a
register is free, the compiler will store the variable there. The other way
occurs during optimization, when the compiler stores an often-used variable
(such as a loop variable) in a register to speed up execution.
Register variables can cause problems during debugging. As with local
variables, they are only visible within the function where they are defined.
In addition, a register variable may not always be displayed with its
current value.
In general, it is a good idea to turn off all optimization and to avoid
declaring register variables until the program has been fully debugged. Any
side effects produced by optimization or register variables can then be
easily isolated.
Redirecting CodeView Input and Output
The Command window accepts DOS-like commands that redirect input and output.
These commands can also be included on the command line that invokes
CodeView. Whatever follows the /C option in the command line is treated as
CodeView commands that are immediately executed at start-up.
CV/c "infile; t >outfile" myprog
Input is redirected to infile, which can contain start-up commands for
CodeView. When CodeView exhausts all commands in the input file, focus
automatically shifts to the command window. Output is sent to outfile and
echoed to the Command window. The t must precede the > command for
output to be sent to the Command window.
Redirection is a useful way to automate CodeView start-up. It also lets you
keep a viewable record of command-line input and output, a feature not
available with dynamic replay. (No record is kept of mouse operations.) Some
applications (particularly interactive ones) may need modification to allow
for redirection of input to the application itself.
Using CodeView with Additional Memory
If your computer uses expanded or extended memory, you can increase
CodeView's functionality by selecting the /X or /E option. CodeView moves as
much as it can of itself, the debugging table, and the program to higher
memory (above the first megabyte).
The /X option uses extended memory and gives the greatest speed increase.
This option requires the HIMEM.SYS driver, which is included on your
distribution disks. Add DEVICE = HIMEM.SYS to your CONFIG.SYS file to load
HIMEM.SYS at boot time.
The /E option uses expanded memory. The speed increase is not as great as
that supplied by the /X option. The expanded memory manager (EMM) must be
LIM 4.0, and no single module's debug information can exceed 48K. If the
symbol table exceeds this limit, try reducing file-name information by not
specifying paths at compile time and using /Zi only with those sections of
the program that need debugging (use /Zd otherwise).
If you do not specify either /X or /E (or the /D disk-overlay option),
CodeView automatically searches for the HIMEM.SYS driver and extended memory
so it can implement the /X option. If it fails, CodeView searches for
expanded memory to implement the /E option. If that search fails, CodeView
uses a default disk overlay of 64K. (See the description of the /D option
below.)
9.7 Controlling CodeView with Command-Line Options
The following options can be added to the command line that invokes
CodeView:
Option Effect
────────────────────────────────────────────────────────────────────────────
/2 Two-monitor debugging. The display
adapters must be configured for
different addresses. One display shows
the output of the application; the other
shows CodeView.
/25 Display in 25-line mode.
/43 Display in 43-line mode (EGA or VGA
only).
/50 Display in 50-line mode (VGA only).
/B Display in black and white. This assures
that the display is readable when a
color display is not used.
/Ccommands All items following this switch are
treated as CodeView commands to be
executed immediately on start-up.
Commands must be separated with a
semicolon (;).
/D«ddd» Use disk overlays, where ddd is the
decimal size of the overlay buffer, in
kilobytes. The acceptable range is 16K
to 128K. The default size is 64K. DOS
only.
/E Use expanded memory for symbolic
information. DOS only.
/F Flip screen video pages. When your
application does not use graphics, eight
video screen pages are available.
Switching from CodeView to the output
screen is accomplished more quickly than
swapping (/S) by directly selecting the
appropriate video page. Cannot be used
with /S. DOS only.
/Inumber Turns nonmaskable interrupts and
8259-interrupt trapping on (/I1) or off
(/I2).
/K Disables installation of keyboard
monitors for the program being debugged.
/Ldlls Load DLLs specified. DLLs must be
separated by a semicolon (;). OS/2 only.
/M Disable the mouse.
/Nnumber /N0 tells CodeView to trap; /N1 tells it
not to.
/O Debug child processes ("offspring").
OS/2 only.
/R Use 386 hardware debug registers. DOS
only.
/S Swap screen in buffers. When your
program uses graphics, all eight screen
buffers must be used. Switching from
CodeView to the output screen is
accomplished by saving the previous
screen in a buffer. Cannot be used with
/F. DOS only.
/X Use extended memory for symbolic
information. DOS only.
9.8 Customizing CodeView with the TOOLS.INI FILE
The TOOLS.INI file customizes the behavior and user interface of several
Microsoft products. The TOOLS.INI file is a plain ASCII text file. You
should place it in a directory pointed to the INIT environment variable. (If
you do not use the INIT environment variable, CodeView looks for TOOLS.INI
only in its source directory.)
The CodeView section of TOOLS.INI is preceded by the following line:
[cv]
If you are running the protected-mode version of CodeView, use [cvp]
instead. If you run both versions, include both: [cv cvp].
Most of the TOOLS.INI customizations control screen colors, but you can also
specify such things as start-up commands or the name of the file that
receives CodeView output. On-line help contains full information about all
TOOLS.INI switches for CodeView.
PART III Special Environments
────────────────────────────────────────────────────────────────────────────
The Microsoft C Professional Development System provides a platform from
which you can build graphics applications and interface with programs
written in other languages.
Chapter 10 discusses using the real-world graphics functions to set video
modes, draw basic shapes, and use graphic fonts. Chapter 11 describes
"presentation graphics," sophisticated charts and graphics that show data
relationships. Chapter 12 explains how to write C programs so that they
interface with assembly language routines or routines written in other
languages. Chapter 13 describes portability of Microsoft C to other
environments.
Chapter 10 Communicating with Graphics
────────────────────────────────────────────────────────────────────────────
A map, a chart, an illustration, a graph, or some other visual aid often can
communicate more information more quickly and more vividly than would
several screens of text.
The extensive Microsoft C graphics library allows you to communicate your
ideas graphically. The functions range from the simple to the complex; from
functions that turn on a pixel to functions that draw graphs and charts
complete with labels and legends.
This chapter describes low-level graphics functions that draw basic shapes
such as lines, circles, and rectangles. It introduces video modes, color
palettes, coordinate systems, and synopses of the graphics and font
functions. For complete function prototypes and example programs, use
on-line help.
────────────────────────────────────────────────────────────────────────────
NOTE
The ANSI C standard does not define any standard graphics functions. The
functions described in this section are unique to Microsoft C and are not
portable to other implementations of C.
────────────────────────────────────────────────────────────────────────────
10.1 Video Modes
Graphics adapters are boards or cards inside the computer that are
responsible for displaying text and graphics on the screen. Commonly used
adapters include:
■ CGA (Color Graphics Adapter)
■ EGA (Enhanced Graphics Adapter)
■ HGC (Hercules Graphics Card)
■ MCGA (Multicolor Graphics Array)
■ MDPA (Monochrome Display Printer Adapter)
■ VGA (Video Graphics Array)
In addition, there are Olivetti versions of the CGA, EGA, and VGA (called
OCGA, OEGA, and OVGA in this chapter).
The video modes available at run time depend on your graphics adapter and
monitor.
Adapters can enter one or more "video modes." The video mode controls the
resolution and number of colors on the video display. Microsoft C supports
17 video modes, which fall into two broad categories:
■ "Text modes," where characters are displayed
■ "Graphics modes," where individual pixels can be turned on and off
The graphics adapter and the type of monitor in use determine which of the
17 video modes are available at run time. See Section 10.1.2 for a list of
video modes.
10.1.1 Sample Low-Level Graphics Program
The program ERESBOX.C below shows, in a few lines, the steps you follow to
enter and exit a graphics mode. It sets the video mode _ERESCOLOR, draws a
box, waits for a keypress, and returns to default mode, which is the video
mode in effect when the program began running.
/* ERESBOX.C -- Enters _ERESCOLOR mode and draws a box */
#include <graph.h> /* graphics functions */
#include <stdio.h> /* puts */
#include <conio.h> /* getch */
main()
{
if( _setvideomode( _ERESCOLOR ) ) /* EGA 640x350 mode */
{
_rectangle( _GBORDER, 10, 10, 110, 110 ); /* draw */
getch(); /* wait for a keypress */
_setvideomode( _DEFAULTMODE ); /* return to default */
} else puts( "Can't enter _ERESCOLOR graphics mode." )
}
The program above illustrates the steps you follow to display graphics:
■ Include the header file GRAPH.H. It contains function prototypes,
macros, useful structures, and symbolic constants such as _ERESCOLOR,
_GBORDER, and _DEFAULTMODE.
#include <graph.h>
■ Call the _setvideomode function, which sets the desired video mode.
The function returns 0 if the hardware does not support the requested
mode. (See Section 10.1.2, "Setting a Video Mode.")
if( _setvideomode( _ERESCOLOR ) )
■ Draw the graphics on the screen. The example program calls the
_rectangle function. (See Section 10.4.3, "Drawing Points, Lines, and
Shapes.")
_rectangle( _GBORDER, 10, 10, 110, 110 )
■ Exit the graphics mode and return to whatever video mode was in effect
before the program began. Call _setvideomode, passing the constant
_DEFAULTMODE. In some cases, you might want to skip this step, exiting
the program with the graphics screen still in place.
_setvideomode( _DEFAULTMODE );
In addition, you must link with the GRAPHICS.LIB library, which contains the
function code. If you use window-coordinate functions (which require
floating-point calculations) and if you have not created a standard combined
library containing a floating-point component, you must explicitly link with
a floating-point math library.
10.1.2 Setting a Video Mode
The _setvideomode function turns on one of the 17 available video modes.
Pass it a single integer that tells it which mode to display. The constants
in Table 10.1 are defined in the GRAPH.H file. The dimensions are listed in
pixels for video graphics mode and in columns for video text mode.
Table 10.1 Constants that Represent Video Modes
╓┌────────────────┌───────────────────────────────────────┌──────────────────╖
Constant (Name) Description Mode/Hardware
────────────────────────────────────────────────────────────────────────────
_DEFAULTMODE Restores the original mode All/All
_ERESCOLOR 640 x 350, 4 or 16 color Graphics/EGA
_ERESNOCOLOR 640 x 350, BW Graphics/EGA
_HRES16COLOR 640 x 200, 16 color Graphics/EGA
Constant (Name) Description Mode/Hardware
────────────────────────────────────────────────────────────────────────────
_HERCMONO* 720 x 348, BW Graphics/HGC
_HRESBW 640 x 200, BW Graphics/CGA
_MAXCOLORMODE Graphics mode with the most colors Graphics/All┼
_MAXRESMODE Graphics mode with the highest Graphics/All┼
resolution
_MRES4COLOR 320 x 200, 4 color Graphics/All
_MRES16COLOR 320 x 200, 16 color Graphics/EGA
_MRES256COLOR 320 x 200, 256 color Graphics/VGA
_MRESNOCOLOR 320 x 200, 4 gray Graphics/CGA
_ORESCOLOR 640 x 400, 1 of 16 colors Graphics/Olivetti
Constant (Name) Description Mode/Hardware
────────────────────────────────────────────────────────────────────────────
_ORESCOLOR 640 x 400, 1 of 16 colors Graphics/Olivetti
_TEXTBW40 40 column text, 16 gray Text/CGA
_TEXTBW80 80 column text, 16 gray Text/CGA
_TEXTC40 40 column text, 16/8 color Text/CGA
_TEXTC80 80 column text, 16/8 color Text/CGA
_TEXTMONO 80 column text, BW Text/MDPA
_VRES2COLOR 640 x 480, BW Graphics/VGA
_VRES16COLOR 640 x 480, 16 color Graphics/VGA
────────────────────────────────────────────────────────────────────────────
* Before attempting to enter _HERCMONO mode, you must install the
terminate-and-stay-resident program MSHERC.COM, which comes in the Microsoft
C package. If you have both a Hercules adapter and an additional graphics
adapter in the same computer, use the /H option to put the Hercules into
HALF mode to avoid unpredictable and undesirable results.
┼ _MAXRESMODE and _MAXCOLORMODE support all adapters except the MDPA. See
Section
If the hardware does not support the selected mode, _setvideomode returns 0.
Some graphics adapters are able to enter additional video modes:
■ EGA adapters can display all CGA modes.
■ HGC adapters can enter _TEXTMONO mode.
■ MCGA adapters can display all CGA modes, plus _VRES2COLOR and
_MRES256COLOR.
■ VGA adapters can display all EGA and CGA modes.
10.1.3 Reading the videoconfig Structure
At any time, you can inquire about the current video configuration by
passing the _getvideoconfig function a structure of type videoconfig. The
structure contains 11 members, all of which are short integers. They are
listed in Table 10.2.
Table 10.2 Members of a videoconfig Structure
╓┌──────────────────────┌────────────────────────────────────────────────────╖
Member Description
────────────────────────────────────────────────────────────────────────────
adapter* Active display adapter
bitsperpixel Number of bits per pixel
memory Adapter video memory in kilobytes
Member Description
────────────────────────────────────────────────────────────────────────────
memory Adapter video memory in kilobytes
mode* Current video mode
monitor* Active display monitor
numcolors Number of color indexes
numtextcols Number of text columns available
numtextrows Number of text rows available
numvideopages Number of video pages available
numxpixels Number of pixels on the x axis
numypixels Number of pixels on the y axis
────────────────────────────────────────────────────────────────────────────
* Possible values for the mode, adapter, and monitor items are listed in the
GRAPH.H file.
The _getvideoconfig function initializes these values. Most of the values
are self-explanatory. For example, if numxpixels holds 640, the current
video mode contains 640 horizontal pixels, numbered 0 - 639.
The READVC.C example program below illustrates how to initialize and examine
a videoconfig structure:
/* READVC.C -- Reads the videoconfig structure */
#include <graph.h>
#include <stdio.h>
main()
{
struct videoconfig vc;
_getvideoconfig( &vc );
printf( "Text Rows = %i.\n", vc.numtextrows );
}
First, the program declares a structure vc of type videoconfig. Next, it
calls _getvideoconfig to initialize the structure. Finally, it prints a
member of the structure.
10.1.4 Maximizing Resolution or Color
Two symbolic constants are new to Microsoft C 6.0: _MAXRESMODE and
_MAXCOLORMODE. The first selects the highest possible resolution for the
graphics adapter and monitor currently in use. The second selects the
graphics mode with the greatest number of colors. The constants work with
all graphics adapters except the MDPA. (See Table 10.3.)
Table 10.3 Constants for Maximum Resolution and Color
╓┌────────────────┌──────────────┌───────────────────────────────────────────╖
Adapter/Monitor _MAXRESMODE _MAXCOLORMODE
────────────────────────────────────────────────────────────────────────────
CGA _HRESBW _MRES4COLOR
EGA color _HRES16COLOR _HRES16COLOR
EGA ecd 64K _ERESCOLOR _HRES16COLOR
EGA ecd 256K _ERESCOLOR _ERESCOLOR
EGA mono _ERESNOCOLOR _ERESNOCOLOR
HGC _HERCMONO _HERCMONO
MCGA _VRES2COLOR _MRES256COLOR
Adapter/Monitor _MAXRESMODE _MAXCOLORMODE
────────────────────────────────────────────────────────────────────────────
MCGA _VRES2COLOR _MRES256COLOR
MDPA Fails Fails
OCGA _ORESCOLOR _MRES4COLOR
OEGA color _ORESCOLOR _ERESCOLOR
VGA/OVGA _VRES16COLOR _MRES256COLOR
────────────────────────────────────────────────────────────────────────────
10.1.5 Selecting Your Own Video Modes
A program that will run only on a single machine with a known graphics
adapter can enter the appropriate video mode immediately. However, if you
attempt to run the program on another machine with a different adapter, it
may not run correctly, if at all.
If your program might run on a variety of computers and you prefer to select
your own video modes, initialize a videoconfig structure by calling the
_getvideoconfig function. Then check the adapter member and use a switch
statement to enter the selected video mode.
For example, suppose you know that a program will run on monochrome systems
equipped with either an EGA adapter or a Hercules adapter. To enter the
appropriate mode, use code such as this:
struct videoconfig vc;
_getvideoconfig( &vc );
switch( vc.adapter )
{
case _EGA:
_setvideomode( _ERESNOCOLOR );
break;
case _HGC:
_setvideomode( _HERCMONO );
break;
}
10.2 Mixing Colors and Changing Palettes
Depending on the graphics card installed and the video mode in effect, you
can display 2, 4, 8, 16, or 256 colors on the screen at the same time. You
specify a color by selecting a color index (sometimes called a "pixel value"
or "color attribute"). The color indexes are numbered from 0 to n-1, where n
is the number of colors in the palette.
CGA adapters offer four different palettes containing predefined fixed color
sets.
All video modes that support color offer a color palette.
EGA, MCGA, and VGA adapters have palettes that can be redefined to suit your
needs. You can change the visible color associated with any color index by
remapping to a color index a color value that describes the true color (the
amount of red, green, and blue) you want to display.
Olivetti adapters (OCGA, OEGA, and OVGA) support the standard CGA, EGA, and
VGA modes (and palettes), plus an additional Olivetti mode described in
Section 10.2.2, "Olivetti Palettes."
────────────────────────────────────────────────────────────────────────────
NOTE
The distinction between a color index and a color value is important. A
color index is always a short integer. A color value is always a long
integer. The only exception to this rule involves _setbkcolor, which uses a
color index cast to a long integer in CGA and text modes.
────────────────────────────────────────────────────────────────────────────
10.2.1 CGA Palettes
The CGA (Color Graphics Adapter) supports two color video modes: _MRES4COLOR
and _MRESNOCOLOR, which display four colors selected from one of several
predefined palettes of colors. They display these foreground colors against
a background color that can be any one of the 16 available colors. With the
CGA hardware, the palette of foreground colors is predefined and cannot be
changed. Each palette number is an integer. (See Table 10.4.)
Table 10.4 CGA Palettes in _MRES4COLOR Mode
╓┌───────────────┌─────────────┌───────────────┌─────────────────────────────╖
Color Index
Palette 1 2 3
Number
────────────────────────────────────────────────────────────────────────────
0 Green Red Brown
1 Cyan Magenta Light Gray
2 Light Green Light Red Yellow
3 Light Cyan Light Magenta White
────────────────────────────────────────────────────────────────────────────
_MRESNOCOLOR produces palettes with shades of gray on monochrome monitors.
The _MRESNOCOLOR video mode produces palettes containing various shades of
gray on monochrome monitors. However, the _MRESNOCOLOR mode displays colors
when used with a color display. Only two palettes are available in this
mode. Table 10.5 shows the colors available in the two palettes.
Table 10.5 CGA Palettes in _MRESNOCOLOR Mode
╓┌───────────────┌────────────┌─────────────┌────────────────────────────────╖
Color Index
Palette 1 2 3
Number
────────────────────────────────────────────────────────────────────────────
0 Blue Red Light Gray
1 Light Blue Light Red White
────────────────────────────────────────────────────────────────────────────
You can use the _selectpalette function only in the _MRES4COLOR,
_MRESNOCOLOR, and _ORESCOLOR graphics modes. To change palettes in other
video modes, use the _remappalette or _remapallpalette functions.
10.2.2 Olivetti(R) Palettes
Olivetti graphics adapters are found in most Olivetti computers (including
the M24, M28, M240, M280, and M380) and in the AT&T 6300 series computers.
These adapters function the same as their non-Olivetti equivalents; that is,
the OCGA, OEGA, and OVGA adapters support CGA, EGA, and VGA modes,
respectively. In addition, Olivetti adapters can enter the high resolution
_ORESCOLOR mode.
In _ORESCOLOR mode, you can choose one of 16 foreground colors by passing a
value in the range 0 -15 to the _selectpalette function. The background
color is always black.
10.2.3 VGA Palettes
Depending on the video mode currently in effect, a VGA (Video Graphics
Array) screen has 2, 16, or 256 color indexes chosen from a pool of 262,144
(256K) color values.
To name a color value, specify a level of intensity ranging from 0 - 63 for
each of the red, green, and blue components. The long integer that defines a
color value contains four bytes (32 bits):
(This figure may be found in the printed book.)
The most-significant byte should contain zeros. The two high bits in the
remaining three bytes should also be zero (these bits are ignored).
To mix a light red (pink), turn red all the way up, and mix in some green
and blue:
(This figure may be found in the printed book.)
The number 0x0020203FL represents this value in hexadecimal notation. You
can also use the following macro:
#define RGB ( r, g, b ) (0x3F3F3FL & ((long)(b) << 16 | (g) << 8 | (r)))
To create pure yellow (100% red plus 100% green) and assign it to a variable
yel, use this line:
yel = RGB( 63, 63, 0 );
For white, turn all the colors on: RGB( 63, 63, 63). For black, set all
colors to 0: RGB( 0, 0, 0 ).
Once you have the color value,
■ Call _remappalette, passing a color index and a color value.
■ Call _setcolor to make that color index the current color.
■ Draw something.
The program YELLOW.C below shows how to remap a color. It draws a rectangle
in color index 3 and then changes index 3 to the color value 0x00003F3FL
(yellow).
/* YELLOW.C -- Draws a yellow box on the screen */
/* Requires VGA or EGA */
#include <graph.h> /* graphics functions */
#include <conio.h> /* getch */
main()
{
short int index3 = 3;
long int yellow = 0x00003F3FL;
long int old3;
if( _setvideomode( _HRES16COLOR ) )
{
/* set current color to index 3*/
_setcolor( index3 );
/* draw a rectangle in that color */
_rectangle( _GBORDER, 10, 10, 110, 110 );
/* wait for a keypress */
getch();
/* change index 3 to yellow */
old3 = _remappalette( index3, yellow );
/* wait for a keypress */
getch();
/* restore the old color */
_remappalette( index3, old3 );
getch();
/* back to default mode */
_setvideomode( _DEFAULTMODE );
} else _outtext( "This program requires EGA or VGA." );
}
10.2.4 MCGA Palettes
In terms of color mixing, the MCGA (Multicolor Graphics Array) adapter is
the same as the VGA. It can display any of 256K colors. It cannot enter all
of the VGA video modes, however. It is limited to CGA modes and _VRES2COLOR
and _MRES256COLOR.
10.2.5 EGA Palettes
Mixing colors in EGA (Enhanced Graphics Adapter) is similar to the VGA
mixing described in Section 10.2.3, but there are fewer levels of intensity
for the red, green, and blue (RGB) components. In the modes that offer 64
colors, the RGB values include two bits and can range in value from 0 - 3.
The long integer that defines a color value looks like this:
(This figure may be found in the printed book.)
The bits marked 0 should be zeros; the bits marked ? are ignored. EGA
color values are defined this way to maintain compatibility with VGA color
values.
To form a pure red color value, use the constant 0x00000030L. For cyan (blue
plus green), use 0x00303000L. The RGB macro defined above for VGA color
mixing can be used as is, or you can modify it for EGA monitors:
#define EGARGB( r, g, b ) (0x303030L & ((long)(b) << 20 | (g) << 12 | (r
<< 4)))
In this macro, you would pass values in the range 0 -3 instead of 0 - 63.
For an example program that remaps a color index to a color value, see
YELLOW.C in Section 10.2.3, "VGA Palettes."
10.2.6 Symbolic Constants
The GRAPH.H file defines the following constants, which can be used as
ready-made color values for EGA and VGA adapters:
╓┌─────────────┌──────────────┌──────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
_BLACK _GREEN _LIGHTYELLOW
_BLUE _LIGHTBLUE _MAGENTA
_BRIGHTWHITE _LIGHTCYAN _RED
_BROWN _LIGHTGREEN _WHITE
_CYAN _LIGHTMAGENTA
_GRAY _LIGHTRED
For example, to change color index 1 to red, use the line
_remappalette( 1, _RED );
which causes any object currently drawn with color index 1 to change to red.
The default color value associated with index 1 is blue.
10.3 Specifying Points within Coordinate Systems
A coordinate system describes points on the screen in terms of their
horizontal (x) and vertical (y) positions. You specify a certain location by
providing two values that map to a unique position.
Graphics functions usually use viewport and window coordinates.
Coordinates on the physical screen never change. Only five functions, listed
in Section 10.3.1, use physical coordinates. All other graphics functions
use one of these two coordinate systems:
■ Viewport coordinates (short integers)
■ Window coordinates (double-precision floating-point numbers)
Viewports and windows can occupy all of the physical screen or just part of
it. The three coordinate systems and conventions for naming points and
regions of the screen are described below.
10.3.1 Physical Coordinates
Within the physical screen, the upper left corner is called the "origin."
The x and y coordinates for the origin are always (0, 0). The x axis extends
in the positive direction left to right, while the y axis extends in the
positive direction top to bottom.
For example, the video mode _VRES16COLOR has a resolution of 640 x 480,
which means the x axis contains the values 0 - 639 (left to right), and the
y axis contains 0 - 479 (top to bottom). (See Figure 10.1.)
(This figure may be found in the printed book.)
Only five functions use physical coordinates: _setcliprgn, _setvieworg,
_setviewport, _getviewcoord, and _getphyscoord.
The _setcliprgn function establishes a "clipping region." Attempts to draw
inside the region succeed, while attempts to draw outside the region are
clipped (ignored). When you first enter a graphics mode, the clipping region
defaults to the entire screen.
The _setvieworg function changes the current location of the origin. When a
program first enters a graphics mode, the physical origin and the viewport
origin are in the upper left corner. The following code moves the viewport
origin to the physical screen location (50, 100):
_setvieworg( 50, 100 );
The effect on the screen is illustrated in Figure 10.2. Note that the number
of pixels remains constant, but the range of legal x values changes from a
range of 0 to 639 (physical screen) to -50 to 589. The legal y values change
as well.
(This figure may be found in the printed book.)
All graphics functions are affected by the new origin, including _arc,
_ellipse, _lineto, _moveto, _outgtext, _pie, and _rectangle.
The third function that uses physical coordinates is _setviewport, described
below, which establishes the boundaries of the current viewport.
10.3.2 Viewport Coordinates
The default viewport coordinate system is identical to the physical screen
coordinate system. The _setviewport function creates a new viewport within
the boundaries of the physical screen. A standard viewport has two
distinguishing features:
■ The origin of a viewport initially lies in the upper left corner of
the viewport, not the upper left corner of the physical screen.
■ The clipping region matches the outer boundaries of the viewport.
Graphics output functions require viewport or window coordinate values.
In other words, the _setviewport function does the same thing as would two
separate calls to _setvieworg and _setcliprgn. All graphics output functions
require values that are either viewport coordinates or window coordinates.
For example,
_setviewport( 50, 50, 200, 100 );
creates the viewport illustrated in Figure 10.3. The values passed to the
_setviewport function are physical screen locations of opposite corners.
After the viewport is created, the viewport origin lies in the upper left
corner.
(This figure may be found in the printed book.)
10.3.3 Window Coordinates
The _setwindow function allows you to use floating-point coordinates instead
of integers. More importantly, it scales the screen coordinates to almost
any size within the current viewport. Window functions take double-precision
arguments and have names that end with the suffixes _w or _wxy. The function
_lineto_w is the window-coordinate equivalent of the viewport function
_lineto.
To create a window for charting 12 months of average temperatures ranging
from - 40 to 100, use this line:
_setwindow( TRUE, 1.0, -40.0, 12.0, 100.0 );
The first argument is the invert flag, which puts the lowest y value at the
bottom of the screen instead of the top. The minimum and maximum coordinates
follow. The new organization of the screen is shown in Figure 10.4.
(This figure may be found in the printed book.)
If you plot a point with _setpixel_w or draw a line with _lineto_w, the
values are automatically scaled to the established window.
Window-coordinate graphics provide a lot of flexibility. You can fit an axis
into a small range (such as 151.25 to 151.45) or into a large range (-50,000
to 80,000), depending on the type of data to be graphed. In addition, by
changing the window coordinates and redrawing a figure, you can create the
effects of zooming in or panning across a figure.
10.3.4 Screen Locations
A coordinate system needs two values (a horizontal and a vertical position)
to describe the location of a point on the screen. There are times, however,
when it is more convenient to use one variable instead of two.
Some graphics functions require you to pass the location of a point on the
screen. Others return a value that represents a location. The GRAPH.H file
defines two structures that allow you to refer to a point with a single
variable.
■ An xycoord structure contains two short integers called xcoord and
ycoord for use in viewport graphics.
■ A _wxycoord structure contains two doubles called wx and wy for use in
window-coordinate graphics.
For example, you pass four doubles to the _rectangle_w function: an x and y
position for the upper left corner of the window and an x and y position for
the lower right corner. The _rectangle_wxy function takes two _wxycoord
structures.
10.3.5 Bounding Rectangles
Certain figures such as arcs and ellipses are centered within a "bounding
rectangle," specified by two points that define the opposite corners of the
rectangle. The center of the rectangle becomes the center of the figure, and
the rectangle's borders determine the size of the figure. Figure 10.5 shows
start and end vectors and a bounding rectangle in which a pie shape has been
drawn with the _pie function. The first two sets of coordinates are x1, y1,
x2, and y2. They define the boundaries of the rectangle. The pie shape needs
two other points, x3, y3, x4, and y4, which indicate the starting and ending
lines.
(This figure may be found in the printed book.)
10.3.6 The Pixel Cursor
A "pixel cursor" is a location on the screen. The _moveto function positions
this cursor at a given spot. Nothing visible appears. If you call _lineto, a
line is drawn from the current pixel cursor to another point. The _lineto
function also changes the location of the pixel cursor. When you call
_outgtext to display fonted text, the characters are drawn at the current
pixel cursor location.
To draw a series of connected lines, call _lineto several times.
The _getcurrentposition function returns the cursor location in an xycoord
structure.
10.4 Graphics Functions
This section lists the functions that work in one or more bit-mapped
graphics modes. Most of these functions are present in several forms. The
function names that end with _w use double values as arguments and the
window coordinate system. Functions that end with _wxy use the window
coordinate system and a _wxycoord structure to define the coordinates.
Functions with no suffix use the viewport coordinate system.
10.4.1 Controlling Video Modes
The functions described below affect the current video mode, coordinate
systems, clipping regions, viewports, and windows. For more information, use
on-line help.
_clearscreen - Erases the text or graphics screen and fills it with the
current background color (note that setting the video mode automatically
clears the screen). Pass one of the constants _GCLEARSCREEN, _GVIEWPORT, or
_GWINDOW. No return value.
_getphyscoord - Converts viewport coordinates to physical coordinates. Pass
an x and y coordinate from the viewport. The function returns an xycoord
structure, which includes an x and a y position from the physical screen.
_getvideoconfig - Obtains the status of the current graphics environment.
Pass it the address of a structure of type _videoconfig. See Section 10.1.3.
"Reading the videoconfig Structure."
_getviewcoord - Converts physical coordinates to viewport coordinates. Pass
two integers: an x and y coordinate. The function returns an xycoord
structure containing the equivalent position within the viewport.
_getviewcoord_w - Converts window coordinates to viewport coordinates. Pass
two doubles that name points within the window. Returns the equivalent
viewport coordinates as an xycoord structure.
_getviewcoord_wxy - Converts window coordinates to viewport coordinates in
an xycoord structure. Pass a _wxycoord structure.
_getwindowcoord - Converts viewport coordinates to window coordinates. Pass
two integers representing viewport coordinates. Returns a _wxycoord
structure.
_setcliprgn - Limits graphic output to part of the screen, called the
"clipping region." Pass four values: the x and y coordinate of the upper
left corner (on the physical screen) and the coordinates of the lower right
corner. The default clipping region is the entire screen. See Section
10.3.1, "Physical Coordinates."
_setvideomode - Selects an operating mode for the display screen. Pass a
constant, such as _HRES16COLOR. Returns 0 if the video mode selected is not
supported by the hardware. See Section 10.1.2, "Setting a Video Mode."
_setvideomoderows - Sets the video mode and the number of rows for text
operations. Pass two values: a video mode and the desired number of text
rows (25, 30, 43, 50, or 60). Pass the symbolic constant _MAXTEXTROWS to get
the largest available number of rows. Returns the number of rows or 0 if
unsuccessful.
_setvieworg - Repositions the viewport origin. Pass an x and y position: the
physical screen location that will become the new origin. Returns the
previous origin in an xycoord structure.
_setviewport - Creates a viewport, including a clipping region and a new
origin in the upper left corner of the viewport. Subsequent calls to
graphics routines will be limited to the viewport area. Pass four short
integers that indicate the physical screen locations of the x and y
coordinates in the upper left and lower right corners of the viewport. No
return value.
_setwindow - Defines a window coordinate system. Pass five values: a short
invert flag (TRUE or FALSE) and four doubles that represent the extreme
values in the upper left and lower right portions of the current viewport.
See Section 10.3.3, "Window Coordinates."
10.4.2 Changing Colors
The functions below control colors and color palettes. For an introduction
to this topic, see Section 10.2, "Mixing Colors and Changing Palettes." For
function prototypes and more information, consult on-line help.
_getbkcolor - Reports the current background color as a long integer. In
EGA, MCGA, and VGA video modes, this is a color value. In CGA and text
modes, it is a color index.
_getcolor - Returns the current color index.
_remapallpalette - Assigns new color values to all color indexes. Pass a
pointer to an array of color values. Returns 0 if unsuccessful.
_remappalette - Assigns a color value to a specific color index. Pass a
short color index and a long color value (which specifies the amount of red,
green, and blue). Returns the previous color value for that index or -1 if
unsuccessful. See Section 10.2.1, "CGA Palettes."
_selectpalette - Selects a predefined palette. This function applies only to
the CGA video modes _MRES4COLOR and _MRESNOCOLOR and the Olivetti graphics
mode _ORESCOLOR. To change palettes in other color video modes, use
_remappalette instead. Pass a short integer in the range 0 - 4 for CGA, or 0
-15 for Olivetti mode. Returns the value of the previous palette.
_setbkcolor - Sets the current background color. Always pass a long integer.
In EGA, MCGA, and VGA modes, this value is a color value. In CGA and text
modes, this is a color index cast to a long integer. Returns the old
background color or -1 if unsuccessful.
_setcolor - Sets the color index to be used for graphic output. It affects
later calls to functions such as _arc, _ellipse, _floodfill, _lineto,
_outgtext, _outtext, _pie, _rectangle, and _setpixel. Returns the previous
color or -1 if unsuccessful.
10.4.3 Drawing Points, Lines, and Shapes
The functions described below draw points, lines, and shapes. For a
definition of bounding rectangle and pixel cursor, see Sections 10.3.5 and
10.3.6.
_arc - Draws an elliptical arc. Pass eight short integers: four pairs of x
and y coordinates. The first two pairs are the corners of the bounding
rectangle. The third and fourth are the starting and ending points of the
arc. Returns 0 if unsuccessful.
_arc_wxy - Draws an arc within the window. Pass four wxycoord structures.
The first two are the corners of the bounding rectangle. The third and
fourth are the starting and ending points of the arc. Returns 0 if
unsuccessful.
_ellipse - Draws an ellipse or a circle. Pass a short fill flag ( _GBORDER
or _GFILLINTERIOR) and four short integers representing the corners of the
bounding rectangle. Returns 0 if unsuccessful.
_ellipse_w - Draws an ellipse or a circle within a window. Pass a short fill
flag ( _GBORDER or _GFILLINTERIOR) and four doubles representing the corners
of the bounding rectangle. Returns 0 if unsuccessful.
_ellipse_wxy - Draws an ellipse or a circle. Pass a short fill flag (
_GBORDER or _GFILLINTERIOR) and two _wxycoord structures representing the
two corners of the bounding rectangle. Returns 0 if unsuccessful.
_getcurrentposition - Returns the current pixel cursor position in viewport
coordinates as an xycoord structure. The current position can be changed by
_arc, _lineto, and _moveto. The default position is the center of the
viewport.
_getcurrentposition_w - Returns the current position of the pixel cursor as
a _wxycoord structure containing the x and y coordinates. Pass nothing.
_getpixel - Returns a pixel's color index. Pass a short x and y coordinate
(in viewport coordinates). If the point is outside the clipping region, the
function returns -1.
_getpixel_w - Returns a pixel's color index. Pass two doubles: an x and y
coordinate.
_lineto - Draws a line from the current pixel cursor position to a specified
point. Pass a short x and a short y position. Returns 0 if unsuccessful.
_lineto_w - Draws a line from the current pixel position to a specified
window coordinate point. Pass a double x and y position. Returns 0 if
unsuccessful.
_moveto - Moves the pixel cursor to a specified point (with no graphic
output). Pass an x and y position. Returns the coordinates of the previous
position in an xycoord structure.
_moveto_w - Moves the pixel cursor to a specified point in a window. Pass
two doubles: an x and a y coordinate. Returns the previous position as a
_wxycoord structure.
_ pie - Draws a figure shaped like a pie slice. Pass a short fill flag and
eight short integers. The first four describe the bounding rectangle. The
final four represent the starting vector and ending vector. Returns 0 if
unsuccessful.
_ pie_wxy - Draws a pie-slice figure within a window. Pass a short fill flag
and four _wxycoord structures. The first two describe the bounding
rectangle. The second two represent the starting vector and ending vector.
Returns 0 if unsuccessful.
_rectangle - Draws a rectangle in the current line style. Pass a short fill
flag ( _GFILLINTERIOR or _GBORDER) and four short integers: the x and y
coordinates of opposite corners. Returns 0 if unsuccessful.
_rectangle_w - Draws a rectangle in the current line style. Pass a short
fill flag ( _GFILLINTERIOR or _GBORDER) and four doubles: the x and y window
coordinates of opposite corners. Returns 0 if unsuccessful.
_rectangle_wxy - Draws a rectangle in the current line style. Pass a short
fill flag ( _GFILLINTERIOR or _GBORDER) and two _wxycoord structures
describing the x and y coordinates of opposite corners. Returns 0 if
unsuccessful.
_setpixel - Sets a pixel to the current color (which is selected by
_setcolor). Pass it integer x and y coordinates. Returns the previous value
of the pixel or -1 if unsuccessful.
_setpixel_w - Sets a pixel to the current color (which is selected by
_setcolor). Pass it double x and y coordinates describing a position within
the window. Returns the previous value of the pixel or -1 if unsuccessful.
10.4.4 Defining Patterns
The following functions control the style in which straight lines are drawn
and the fill pattern used for solid shapes. For more information, use
on-line help.
_floodfill - Fills a bounded shape with the fill pattern set by _setfillmask
in the current color established by _setcolor. Pass an x and y coordinate
and a boundary color (the color index that marks the edge of the shape to be
filled). Returns 0 if unsuccessful.
_floodfill_w - Fills a bounded shape with the fill pattern set by
_setfillmask. Pass doubles that describe an x and y position within the
window and a boundary color (the color index that marks the edge of the
shape to be filled). Returns 0 if unsuccessful.
_getfillmask - Returns the address of the current fill mask, an
eight-character array, or 0 if the fill mask is not currently defined.
_getlinestyle - Returns the line style, a short integer whose bits
correspond to the screen pixels turned on or off within a line.
_setfillmask - Sets the current fill mask used by _floodfill and functions
that draw solid shapes (_ellipse, _pie, and _rectangle). Pass the address of
an array of eight unsigned characters, where each bit represents a pixel.
The pixels are drawn in the current color. No return value.
_setlinestyle - Sets the current style, which is used to draw the straight
lines within _lineto, _rectangle, and _pie. Pass an unsigned short integer
within which the bits correspond to the pixels on screen. For example,
0xFFFF represents a solid line, 0xAAAA is a dotted line, and 0xF0F0 is
dashed.
10.4.5 Manipulating Images
The functions described below can be used to create animated graphics. The
_getimage and _putimage functions act like a rubber stamp; after capturing a
shape, you can make copies anywhere on the screen.
_getimage - Stores a screen image in memory. Pass four integers (the
coordinates of the bounding rectangle) and a pointer to a storage buffer.
Call _imagesize to find out how much memory is required. No return value.
_getimage_w - Stores a screen image in memory. Pass four doubles (the
coordinates of the bounding rectangle) and a pointer to a storage buffer.
Call _imagesize_w to find out how much memory is required. No return value.
_getimage_wxy - Same as _getimage_w, but you pass two _wxycoord structures
and a pointer to memory.
_imagesize - Returns a long integer representing the size of an image in
bytes. Call this function in preparation for a call to _getimage. Pass four
integers: the x and y coordinates of opposite corners of the portion of the
screen to be saved.
_imagesize_w - Returns the size of an image in bytes in preparation for a
call to _getimage_w and _putimage_w. Pass four doubles: the x and y window
coordinates of opposite corners of the portion of the screen to be saved.
_imagesize_wxy - Same as _imagesize_w, but you pass two _wxycoord
structures.
_putimage - Retrieves an image from memory and displays it on the active
screen page. The image should previously have been saved to memory with
_getimage. Pass two short integers (coordinates where the image is to be
placed), a pointer to the image, and a short integer indicating what kind of
action to take: _GAND, _GOR, _GPRESET, _GPSET, or _GXOR. No return value.
_putimage_w - Displays an image from memory within a window. The image
should previously have been saved to memory with _getimage_w. Pass two
doubles (coordinates where the image is to be placed), a pointer to the
image, and a short integer indicating what kind of action to take: _GAND,
_GOR, _GPRESET, _GPSET, or _GXOR. No return value.
10.5 Using Graphic Fonts
A "font" is a collection of stylized text characters. Each font consists of
a typeface with several type sizes.
A "typeface" is the name of the displayed text─Courier, for example, or
Roman. The list on the next page shows six of the typefaces available with
the Microsoft C font library.
"Type size" measures the screen area occupied by individual characters in
units of screen pixels. For example, "Courier 12 x 9" denotes text of
Courier typeface, with each character occupying a screen area of 12 vertical
pixels by 9 horizontal pixels.
A font's spacing can be fixed or proportional. "Fixed" means that all
characters have the same width in pixels. "Proportional" means the width
varies. An i, for example, is thinner than an M.
The Microsoft C font functions use two methods to create fonts. The first
technique generates Courier, Helv, and Tms Rmn fonts through a "bit-mapping"
(or "raster-mapping") technique. Bit-mapping defines character images with
binary data. Each bit in the map corresponds to a screen pixel. If a bit is
1, its associated pixel is set to the current screen color.
The second method creates the remaining three type styles─Modern, Script,
and Roman─as "vector-mapped" fonts. Vector-mapping represents each character
in terms of lines and arcs.
Each method has advantages and disadvantages. Bit-mapped characters are more
completely formed since the pixel mapping is predetermined. However, they
cannot be scaled. Vector-mapped text can be scaled to any size, but the
characters tend to lack the solid appearance of the bit-mapped characters.
The following list shows six sample typefaces:
(This figure may be found in the printed book.)
Table 10.6 lists available sizes for each font. Note that the bit-mapped
fonts come in preset sizes as measured in pixels. The vector-mapped fonts
can be scaled to any size.
Table 10.6 Typefaces and Type Sizes in the C Library
╓┌─────────┌────────┌─────────────────┌──────────────────────────────────────╖
Typeface Mapping Size (in pixels) Spacing
────────────────────────────────────────────────────────────────────────────
Courier Bit 10 x 8, 12 x 9, Fixed
15 x 12
Helv Bit 10 x 5, 12 x 7, Proportional
15 x 8, 18 x 9,
22 x 12, 28 x 16
Tms Rmn Bit 10 x 5, 12 x 6, Proportional
Typeface Mapping Size (in pixels) Spacing
────────────────────────────────────────────────────────────────────────────
Tms Rmn Bit 10 x 5, 12 x 6, Proportional
15 x 8, 16 x 9,
20 x 12, 26 x 16
Modern Vector Scaled Proportional
Script Vector Scaled Proportional
Roman Vector Scaled Proportional
────────────────────────────────────────────────────────────────────────────
10.5.1 Using the C Font Library
Data for both bit-mapped and vector-mapped fonts reside in .FON files. For
example, the files MODERN.FON, ROMAN.FON, and SCRIPT.FON hold data for the
three vector-mapped fonts.
You can use Microsoft Windows .FON files.
The Microsoft C .FON files are identical to the .FON files used in the
Microsoft Windows operating environment. If you have access to Windows, you
can use any of its .FON files with Microsoft C font functions. In addition,
several vendors offer software that creates or modifies .FON files, allowing
you to design your own fonts.
Your programs should follow these three steps to display fonted text:
1. Register the fonts.
2. Set the current font from the register.
3. Display text using the current font.
The following sections describe each of the three steps in detail. An
example program in Section 10.5.5 demonstrates these steps.
10.5.2 Registering the Fonts
The fonts must first be organized into a list in memory, a process called
"registering." Register fonts by calling the function _registerfonts. This
function reads header information from specified .FON files, building a list
of file information but not reading any mapping data from the files.
The GRAPH.H file prototypes the _registerfonts function as
short far _registerfonts( unsigned char far * );
The argument points to a string containing a file name. The file name is the
name of the .FON file for the desired font. The file name can include wild
cards, allowing you to register several fonts with one call to
_registerfonts.
If it successfully reads one or more .FON files, _registerfonts returns the
number of fonts. If the function fails, it returns a negative error code.
10.5.3 Setting the Current Font
Call the function _setfont to select a current font. This function checks to
see if the requested font is registered, then reads the mapping data from
the appropriate .FON file. A font must be registered and marked current
before your program can display text in that font.
The GRAPH.H file prototypes the_setfonts function as
short far _setfont( unsigned char far * );
The function's argument is a pointer to a character string. The string
consists of letter codes that describe the desired font, as outlined here:
Option Code Meaning
────────────────────────────────────────────────────────────────────────────
b The best fit from the registered fonts.
This option instructs _setfont to accept
the closest-fitting font if a font of
the specified size is not registered.
If at least one font is registered, the
b option always sets a current font. If
you do not specify the b option and an
exact matching font is not registered,
the
_setfont function will fail. In this
case, any existing current font remains
current. Refer to on-line help for a
description of error codes returned by
_setfont.
The _setfont function uses four criteria
for selecting the best fit. In
descending order of precedence, the four
criteria are pixel height, typeface,
pixel width, and spacing (fixed or
proportional). If you request a
vector-mapped font, _setfont sizes the
font to correspond with the specified
pixel height and width. If you request a
raster-mapped (bit-mapped) font,
_setfont chooses the closest available
size. If the requested type size for a
raster-mapped font fits exactly between
two registered fonts, the smaller size
takes precedence.
f Fixed-spaced font.
hy Character height, where y is the height
in pixels.
nx Font number x, where x is less than or
equal to the value returned by
_registerfonts. For example, the option
n3 makes the third registered font
current, if three or more fonts are
registered.
p Proportional-spaced font.
r Raster-mapped (bit-mapped) font.
t`fontname' Typeface of the font in single quotes.
The fontname string is one of the
following:
courier modern helv script tms rmn roman
Note the space in tms rmn. Additional
font files use other names for fontname.
Refer to the vendor's documentation for
these names.
v Vector-mapped font.
wx Character width, where x is the width in
pixels.
Option codes are not case sensitive and can be listed in any order. You can
separate codes with spaces or any other character that is not a valid option
code. The _setfont function ignores all invalid codes.
The _setfont function updates a data area with parameters of the current
font. The data area is in the form of a structure, defined in GRAPH.H as
follows:
struct _fontinfo
{
int type; /* set = vector,clear = bit map */
int ascent; /* pix dist from top to base */
int pixwidth; /* character width in pixels */
int pixheight; /* character height in pixels */
int avgwidth; /* average character width */
char filename[81]; /* file name including path */
char faceName[32]; /* font name */
};
If you want to retrieve the parameters of the current font, call the
function _getfontinfo.
10.5.4 Displaying Text
The last step, displaying text, consists of two parts. First you must select
a screen position for the text with the graphics function _moveto. Then
display fonted text at that position with the function _outgtext. The
_moveto function takes pixel coordinates as arguments. The coordinates
locate the top left of the first character in the text string.
10.5.5 A Sample Program
The program SAMPLER.C displays sample text in all the available fonts, then
exits when a key is pressed. Make sure the .FON files are in the current
directory before running the program.
/* SAMPLER.C: Displays sample text in various fonts. */
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include <graph.h>
#include <string.h>
#define NFONTS 6
main()
{
static unsigned char *text[2*NFONTS] =
{
"COURIER", "courier",
"HELV", "helv",
"TMS RMN", "tms rmn",
"MODERN", "modern",
"SCRIPT", "script",
"ROMAN", "roman"
};
static unsigned char *face[NFONTS] =
{
"t'courier'",
"t'helv'",
"t'tms rmn'",
"t'modern'",
"t'script'",
"t'roman'"
};
static unsigned char list[20];
struct videoconfig vc;
int mode = _VRES16COLOR;
register i;
/* Read header info from all .FON files in
* current directory
*/
if( _registerfonts( "*.FON" ) < 0 )
{
_outtext( "Error: can't register fonts" );
exit( 0 );
}
/* Set highest available video mode */
if( _setvideomode( _MAXRESMODE ) == 0 )
exit ( 0 );
/* Copy video configuration into structure vc */
_getvideoconfig( &vc );
/* Display six lines of sample text */
for( i = 0; i < NFONTS; i++ )
{
strcpy( list, face[i] );
strcat( list, "h30w24b" );
if( _setfont( list ) >= 0 )
{
_setcolor( i + 1 );
_moveto( 0, (i * vc.numypixels) / NFONTS );
_outgtext( text[i * 2] );
_moveto( vc.numxpixels / 2,
(i * vc.numypixels) / NFONTS );
_outgtext( text[(i * 2) + 1] );
}
else
{
_setvideomode( _DEFAULTMODE );
_outtext( "Error: can't set font" );
exit( 0 );
}
}
getch();
_setvideomode( _DEFAULTMODE );
/* Return memory when finished with fonts */
_unregisterfonts();
exit( 0 );
}
10.5.6 Using Fonts Effectively
Displaying fonts is simply another form of graphics; using fonts effectively
requires little programming effort. Still, there are a few things to watch:
■ Remember that the video mode should be set only once. If you generate
an image with presentation graphics and want to add text to it, do not
reset the video mode prior to calling the font routines. Doing so will
blank the screen, destroying the original image.
■ The _setfont function reads specified .FON files to obtain mapping
data for the current font. Each call to _setfont causes a disk access
and overwrites the old font data in memory. If you want to show text
of different styles on the same screen, display all text of one font
before moving on to the others. Minimizing the number of calls to
_setfont saves time spent in disk I/O and memory reloads.
■ When your program finishes using the fonts library, you may want to
free the memory occupied by the register list by calling
_unregisterfonts. This function frees the memory allocated by
_registerfonts. The register information for each type size of each
font takes up approximately 140 bytes of memory.
■ Aesthetic suggestions for the printed page also apply to screen text.
Typefaces are more effective when they do not compete with each other
for attention. Restricting the number of styles per screen to one or
two generally results in a more pleasing, less cluttered image.
Chapter 11 Creating Charts and Graphs
────────────────────────────────────────────────────────────────────────────
The low-level graphics functions described in Chapter 10, "Communicating
with Graphics," draw points, lines, and shapes. Although it is possible to
use them to generate charts and graphs, an additional set of high-level
graphics functions is better suited to this task.
"Presentation graphics" is a set of high-level functions that displays
presentation-quality graphics. These functions transform numeric data into
pie charts, bar and column charts, line graphs, and scatter diagrams.
This chapter describes how to use presentation graphics.
11.1 Overview of Presentation Graphics
The presentation graphics library PGCHART.LIB contains 22 functions. They
are listed in Table 11.1 for convenient reference.
Table 11.1 Presentation Graphics Function
╓┌───────────────────┌─────────────────────┌─────────────────────────────────╖
Primary Functions Secondary Functions
────────────────────────────────────────────────────────────────────────────
_pg_chart _pg_analyzechart _pg_hlabelchart
Primary Functions Secondary Functions
────────────────────────────────────────────────────────────────────────────
_pg_chart _pg_analyzechart _pg_hlabelchart
_pg_chartms _pg_analyzechartms _pg_resetpalette
_pg_chartpie _pg_analyzepie _pg_resetstyleset
_pg_chartscatter _pg_analyzescatter _pg_setchardef
_pg_chartscatterms _pg_analyzescatterms _pg_setpalette
_pg_defaultchart _pg_getchardef _pg_setstyleset
_pg_initchart _pg_getpalette _pg_vlabelchart
_pg_getstyleset
────────────────────────────────────────────────────────────────────────────
The seven primary functions initialize variables and display selected chart
types.
In most cases, you will be using only seven "primary functions." These
functions initialize variables and display selected chart types. The 15
"secondary functions" of presentation graphics do not directly display
charts. Most of them retrieve or set data in the presentation graphics chart
environment.
Among the secondary functions are the "analysis functions," identified by
the prefix _pg_analyze. These five functions calculate default values that
pertain to a given chart type and data set. Calling an analysis function has
the same effect as calling a corresponding primary function, except that the
chart is not displayed. This allows you to pass on to the library the burden
of calculating values. You can then make modifications to the resulting
values and call a primary routine to display the chart.
Use the _pg_hlabelchart and _pg_vlabelchart functions to display text that
is not part of a title or axis label on your chart. These functions enable
you to attach notes or other messages to your chart.
11.2 Parts of a Graph
This section describes the terms used to refer to the different kinds of
information that can be plotted. The various types of charts and graphs are
also defined.
Data Series
Data that are related by a common idea or purpose constitute a "series." For
example, the prices of a futures commodity over the course of a year form a
single series of data. The volume forms a second data series.
When you include several series in one chart, characteristics such as color
and pattern can help distinguish one from another. You can more readily
differentiate series on a color monitor than you can on a monochrome
monitor. The number of series that can appear on the same chart depends on
the chart type and the number of available colors.
Categories
"Categories" are nonnumeric data. A set of categories forms a frame of
reference for the comparison of numeric data. For example, the months of the
year are categories against which numeric data such as inches of rainfall
can be plotted.
Regional sales provide another example. A chart can compare a company's
sales in different parts of the country. Each region forms a category.
Values
"Values" are numeric data. Sales, stock prices, air temperatures, and
populations are all series of values that can be plotted against categories
or against other values.
Presentation graphics allows you to overlay different series of value data
on a single graph. For example, average monthly temperatures or monthly
sales of heating oil during different years─or a combination of temperatures
and sales─can be plotted together on the same graph.
Pie Charts
"Pie charts" are used to represent data by showing the relationship of each
part to the whole. A good example is a company's annual budget. A pie chart
allows you to view each area of revenue or spending by its relative size
within the context of the entire company budget.
Presentation graphics can display either a standard or an "exploded" pie
chart. The exploded view shows the pie with one or more pieces separated for
emphasis. You can label each slice of a pie chart with a percentage figure
if you wish.
Bar and Column Charts
As the name implies, a "bar chart" shows data as horizontal bars. Bar charts
show comparisons among items rather than absolute value.
"Column charts" are vertical bar charts. Column charts are frequently used
to show variations over a period of time, since they suggest time flow
better than a bar chart.
Line Graphs
"Line graphs" illustrate trends or changes in data. They show how a series
of values varies against a particular category─for example, average
temperatures throughout one year.
Traditionally, line graphs show a collection of data points connected by
lines. Presentation graphics can also plot points that are not connected by
lines.
Scatter Diagrams
A "scatter diagram" is the only type of graph available in presentation
graphics that directly compares values with values. A scatter diagram simply
plots points.
Scatter diagrams illustrate the relationship between numeric values in
different groups of data. They graphically show trends and correlations not
easily detected from rows and columns of raw numbers.
Scatter diagrams are most useful with large amounts of data. Consider, for
example, the relationship between personal income and family size. If you
poll one thousand wage earners for their income and family size, you have a
scatter diagram with one thousand points. If you combine your results so
that you are left with one average income for each family size, you have a
line graph.
Axes
All presentation graphics charts except pie charts are displayed with two
perpendicular reference axes. The vertical, or y, axis runs from top to
bottom of the chart and is placed against the left side of the screen. The
horizontal, or x, axis runs from left to right across the bottom of the
screen.
The chart type determines the axis used for category data and the axis for
value data.
The x axis is the category axis for column and line charts and the value
axis for bar charts. The y axis is the value axis for column and line charts
and the category axis for bar charts.
Chart Windows
The "chart window" defines that part of the screen on which the chart is
drawn. By default, the window fills the entire screen, but presentation
graphics allows you to resize the window for smaller graphs. By redefining
the chart window to different screen locations, you can view separate graphs
together on the same screen.
Data Windows
While the chart window defines the entire graph including axes and labels,
the "data window" defines only the actual plotting area. This is the portion
of the graph to the right of the y axis and above the x axis. You cannot
specify or adjust the size of the data window. Presentation graphics
automatically determines its size based on the dimensions of the chart
window.
Chart Styles
Each of the five types of presentation graphics charts can appear in two
different "chart styles," as described in Table 11.2.
Table 11.2 Presentation Graphics Chart Styles
╓┌───────────┌───────────────────┌───────────────────────────────────────────╖
Chart Type Chart Style #1 Chart Style #2
────────────────────────────────────────────────────────────────────────────
Pie With percentages Without percentages
Bar Side-by-side Stacked
Column Side-by-side Stacked
Line Points with lines Points only
Scatter Points with lines Points only
────────────────────────────────────────────────────────────────────────────
Bar and column charts have only one style when displaying a single series of
data. The styles "side-by-side" and "stacked" are applicable when more than
one series appears on the same chart. The first style arranges the bars or
columns for the different series side by side, showing relative heights or
lengths. The stacked style, illustrated for a column chart in Figure 11.3,
emphasizes relative sizes between bars or columns.
Legends
Legends help identify individual data series.
When displaying more than one data series on a chart, presentation graphics
uses different colors, line styles, or patterns to differentiate them.
Presentation graphics also can display a "legend" that labels the different
series of a chart. For a pie chart, the legend labels individual slices of
the pie.
A sample of the color and pattern used to graph the series appears next to
the series label. This identifies the set of data to which the labels
belong.
You may change the font displayed by calling the _registerfonts and _setfont
functions (see Section 10.5 for more information about using fonts). If you
don't select a font, presentation graphics defaults to an internal font.
11.3 Writing a Presentation Graphics Program
To write a C program that uses presentation graphics, follow these steps:
1. Include the required header files, GRAPH.H and PGCHART.H, as well as
any other header files your program may need.
2. Set the video mode to a graphics mode. See Chapter 10, "Communicating
with Graphics," for a description of video modes.
3. Initialize the presentation graphics chart environment. Presentation
graphics places charting parameters in data structures. The amount of
initialization that must be done by your program depends on how
extensively it relies on the defaults.
4. Assemble the plot data. Data can be collected in a variety of ways: by
calculating it elsewhere in the program, reading it from files, or
entering it from the keyboard. All plot data must be assembled in
arrays because the presentation graphics functions locate them through
pointers.
5. Call presentation graphics functions to display the chart. Pause while
the chart is on the screen.
6. Reset the video mode. When your program detects the signal to
continue, it should reset the video to its original (default) mode.
After compiling the program, link it to the library modules PGCHART.LIB and
GRAPHICS.LIB.
The sample programs in Sections 11.3.1-11.3.3 use 5 of the 22 presentation
graphics functions: _pg_initchart, _pg_defaultchart, _pg_chartpie,
_pg_chart, and _pg_chartscatter. Each program is commented so that you can
recognize the steps given in this section.
11.3.1 Pie Chart
The following program uses presentation graphics to display a pie chart for
monthly sales of orange juice over a year. The chart, which is shown in
Figure 11.1, remains on the screen until a key is pressed.
/* PIE.C: Create sample pie chart. */
#include <conio.h>
#include <string.h>
#include <graph.h>
#include <pgchart.h>
#define MONTHS 12
typedef enum {FALSE, TRUE} boolean;
float far value[MONTHS] =
{
33.0, 27.0, 42.0, 64.0,106.0,157.0,
182.0,217.0,128.0, 62.0, 43.0, 36.0
};
char far *category[MONTHS] =
{
"Jan", "Feb", "Mar", "Apr",
"May", "Jun", "Jly", "Aug",
"Sep", "Oct", "Nov", "Dec"
};
short far explode[MONTHS] = {0};
main()
{
chartenv env;
int mode = _VRES16COLOR;
/* Set highest video mode available */
if( _setvideomode( _MAXRESMODE ) == 0 )
exit( 0 );
/* Initialize chart library and a default pie chart */
_pg_initchart();
_pg_defaultchart( &env, _PG_PIECHART, _PG_PERCENT );
/* Add titles and some chart options */
strcpy( env.maintitle.title, "Good Neighbor Grocery" );
env.maintitle.titlecolor = 6;
env.maintitle.justify = _PG_RIGHT;
strcpy( env.subtitle.title, "Orange Juice Sales" );
env.subtitle.titlecolor = 6;
env.subtitle.justify = _PG_RIGHT;
env.chartwindow.border = FALSE;
/* Parameters for call to _pg_chartpie are:
*
* env - Environment variable
* category - Category labels
* value - Data to chart
* explode - Separated pieces
* MONTHS - Number of data values
*/
if( _pg_chartpie( &env, category, value,
explode, MONTHS ) )
{
_setvideomode( _DEFAULTMODE );
_outtext( "Error: can't draw chart" );
}
else
{
getch();
_setvideomode( _DEFAULTMODE );
}
return( 0 );
}
(This figure may be found in the printed book.)
11.3.2 Bar, Column, and Line Charts
The code for the PIE.C program needs only minor alterations to produce bar,
column, and line charts for the same data:
■ Replace the call to _pg_chartpie with _pg_chart. This function
produces bar, column, and line charts depending on the value of the
second argument for _pg_defaultchart.
■ Give new arguments to _pg_defaultchart that specify chart type and
style.
■ Assign titles for the x axis and y axis in the structure env.
■ Remove references to array explode, which is applicable only to pie
charts.
The following example produces a bar chart for the store owner's data. The
result is shown in Figure 11.2.
/* BAR.C: Create sample bar chart. */
#include <conio.h>
#include <string.h>
#include <graph.h>
#include <pgchart.h>
#define MONTHS 12
typedef enum {FALSE, TRUE} boolean;
float far value[MONTHS] =
{
33.0, 27.0, 42.0, 64.0,106.0,157.0,
182.0,217.0,128.0, 62.0, 43.0, 36.0
};
char far *category[MONTHS] =
{
"Jan", "Feb", "Mar", "Apr",
"May", "Jun", "Jly", "Aug",
"Sep", "Oct", "Nov", "Dec"
};
main()
{
chartenv env;
int mode = _VRES16COLOR;
/* Set highest video mode available */
if( _setvideomode( _MAXRESMODE ) == 0 )
exit( 0 );
/* Initialize chart library and a default bar chart */
_pg_initchart();
_pg_defaultchart( &env, _PG_BARCHART, _PG_PLAINBARS );
/* Add titles and some chart options */
strcpy( env.maintitle.title, "Good Neighbor Grocery" );
env.maintitle.titlecolor = 6;
env.maintitle.justify = _PG_RIGHT;
strcpy( env.subtitle.title, "Orange Juice Sales" );
env.subtitle.titlecolor = 6;
env.subtitle.justify = _PG_RIGHT;
strcpy( env.yaxis.axistitle.title, "Months" );
strcpy( env.xaxis.axistitle.title, "Quantity (cases)" );
env.chartwindow.border = FALSE;
/* Parameters for call to _pg_chart are:
* env - Environment variable
* category - Category labels
* value - Data to chart
* MONTHS - Number of data values
*/
if( _pg_chart( &env, category, value, MONTHS ) )
{
_setvideomode( _DEFAULTMODE );
_outtext( "Error: can't draw chart" );
}
else
{
getch();
_setvideomode( _DEFAULTMODE );
}
return( 0 );
}
(This figure may be found in the printed book.)
The grocer's bar chart becomes a column chart in two easy steps. Simply
specify the new chart type when calling _pg_defaultchart and change the axis
titles. To produce a column chart for the grocer's data, replace the call to
_pg_defaultchart with
_pg_defaultchart( &env, _PG_COLUMNCHART, _PG_PLAINBARS );
Replace the last two calls to strcpy with
strcpy( env.xaxis.axistitle.title, "Months" );
strcpy( env.yaxis.axistitle.title, "Quantity (cases)" );
Note that now the x axis is labeled "Months" and the y axis is labeled
"Quantity (cases)." Figure 11.3 shows the resulting column chart.
(This figure may be found in the printed book.)
Creating an equivalent line chart requires only one change. Use the same
code as for the column chart and replace the call to _pg_defaultchart with
_pg_defaultchart( &env, _PG_LINECHART, _PG_POINTANDLINE );
Figure 11.4 shows the line chart for the grocer's data.
(Please refer to the printed book.)
(This figure may be found in the printed book.)
11.3.3 Scatter Diagram
The program SCATTER.C displays a scatter diagram that illustrates the
relationship between the sales of orange juice and hot chocolate throughout
a 12-month period. Figure 11.5 shows the results of SCATTER.C. Notice that
the scatter points form a slightly curved line, indicating that a
correlation exists between the sales of the two products. The demand for
orange juice is roughly inverse to the demand for hot chocolate.
/* SCATTER.C: Create sample scatter diagram. */
#include <conio.h>
#include <string.h>
#include <graph.h>
#include <pgchart.h>
#define MONTHS 12
typedef enum {FALSE, TRUE} boolean;
/* Orange juice sales */
float far xvalue[MONTHS] =
{
33.0, 27.0, 42.0, 64.0,106.0,157.0,
182.0,217.0,128.0, 62.0, 43.0, 36.0
};
/* Hot chocolate sales */
float far yvalue[MONTHS] =
{
37.0, 37.0, 30.0, 19.0, 10.0, 5.0,
2.0, 1.0, 7.0, 15.0, 28.0, 39.0
};
main()
{
chartenv env;
int mode = _VRES16COLOR;
/* Set highest video mode available */
if( _setvideomode( _MAXRESMODE ) == 0 )
exit( 0 );
/* Initialize chart library and default
* scatter diagram
*/
_pg_initchart();
_pg_defaultchart( &env, _PG_SCATTERCHART,
_PG_POINTONLY );
/* Add titles and some chart options */
strcpy( env.maintitle.title, "Good Neighbor Grocery" );
env.maintitle.titlecolor = 6;
env.maintitle.justify = _PG_RIGHT;
strcpy( env.subtitle.title,
"Orange Juice vs Hot Chocolate" );
env.subtitle.titlecolor = 6;
env.subtitle.justify = _PG_RIGHT;
env.yaxis.grid = TRUE;
strcpy( env.xaxis.axistitle.title,
"Orange Juice Sales" );
strcpy( env.yaxis.axistitle.title,
"Hot Chocolate Sales" );
env.chartwindow.border = FALSE;
/* Parameters for call to _pg_chartscatter are:
* env - Environment variable
* xvalue - X-axis data
* yvalue - Y-axis data
* MONTHS - Number of data values
*/
if( _pg_chartscatter( &env, xvalue,
yvalue, MONTHS ) )
{
_setvideomode( _DEFAULTMODE );
_outtext( "Error: can't draw chart" );
}
else
{
getch();
_setvideomode( _DEFAULTMODE );
}
return( 0 );
}
(This figure may be found in the printed book.)
11.4 Manipulating Colors and Patterns
Presentation graphics displays each data series in a way that makes it
discernible from other series. It does this by defining a separate "palette"
for every data series in a chart. Palettes consist of entries that determine
color, line style, fill pattern, and point character used to graph the
series.
Presentation graphics maintains its palettes as an array of structures. The
header file PGCHART.H defines the palette structures as shown below:
/* Typedef for pattern bitmap */
typedef unsigned char fillmap[8];
/* Typedef for palette entry definition */
typedef struct
{
unsigned short color;
unsigned short style;
fillmap fill;
char plotchar;
} paletteentry;
/* Typedef for palette definition */
typedef paletteentry palettetype[_PG_PALETTELEN];
Do not confuse the presentation graphics palettes with the adapter display
palettes, which are register values kept by the video controller. The
function _selectpalette described in Chapter 10, "Communicating with
Graphics," sets the display palette. It does not define the data series
palettes used by presentation graphics.
11.4.1 Color Pool
The color pool determines the colors of graphic elements (axes, labels,
legends, titles).
Presentation graphics organizes all chart colors into a "color pool." The
color pool holds the color index values valid for the current graphics mode.
(Refer to Chapter 10, "Communicating with Graphics," for more information
about the color index.) Palette structures contain color codes that refer to
the color pool. A palette's color index determines the colors used to graph
the data series associated with the palette. The colors of labels, titles,
legends, and axes are determined by the contents of the color pool.
The first element of the color pool is always 0, which is the color index
for the screen background color. The second element is always the highest
color index available for the graphics mode. The remaining elements repeat
the sequences of available pixel values, beginning with 1.
As shown in the example in Section 11.4, the first member of a palette data
structure is
unsigned short color;
This member defines the color index for the data series associated with the
palette.
An example should make this clearer. A graphics mode of _MRES4COLOR (320 by
200 pixels) provides four colors for display. Color index values from 0 to 3
determine the possible colors─say, black, green, red, and brown,
respectively. The first eight elements of this color pool are shown below.
╓┌─────────────────┌────────────┌────────────────────────────────────────────╖
Color Pool Index Color Index Color
────────────────────────────────────────────────────────────────────────────
0 0 Black
1 3 Brown
Color Pool Index Color Index Color
────────────────────────────────────────────────────────────────────────────
1 3 Brown
2 1 Green
3 2 Red
4 3 Brown
5 1 Green
6 2 Red
7 3 Brown
────────────────────────────────────────────────────────────────────────────
Notice that the sequence of available foreground colors repeats from the
third element. The first data series in this case would be plotted in brown,
the second series in green, the third series in red, the fourth series again
in brown, and so forth.
Video adapters such as the EGA or the Hercules(R) InColor(tm) Card allow 16
on-screen colors. This allows presentation graphics to graph more series
without duplicating colors.
11.4.2 Style Pool
Presentation graphics matches the color pool with a collection of different
line styles called the "style pool." Entries in the style pool define the
appearance of lines such as axes and grids. Lines can be solid, dotted,
dashed, or some combination of styles.
The second member of a palette structure defines a style code as
unsigned short style;
Each palette contains a style code that refers to an entry in the style pool
in the same way that it contains a color code that refers to an entry in the
color pool. The style code value in a palette is applicable only to line
graphs and lined scatter diagrams. The style code determines the appearance
of the lines drawn between points.
Use the different line styles in the style pool to differentiate series.
The palette's style code adds further variety to the lines of a multiseries
graph. It is most useful when the number of lines in a chart exceeds the
number of available colors. For example, a graph of nine different data
series must repeat colors if only three foreground colors are available for
the display. However, the style code for each color repetition will be
different, ensuring that none of the lines looks the same.
11.4.3 Pattern Pool
Presentation graphics also maintains a pool of "fill patterns" that
determine the fill design for column, bar, and pie charts. The third member
of the palette structure holds the fill pattern. The pattern member is an
array:
fillmap fill;
where fillmap is type-defined as
typedef unsigned char fillmap[8];
Each fill pattern array holds an 8-by-8 bit map that defines the fill
pattern for the data series associated with the palette. Table 11.3 shows
how a fill pattern of diagonal stripes is created with the fill pattern
array.
The bit map in Table 11.3 corresponds to screen pixels. Each of the eight
layers of the map is a binary number, where a solid circle signifies 1 and
an open circle signifies 0. Thus the first layer of the map─that is, the
first byte─represents the binary number 10011001, which is the decimal
number 153.
Table 11.3 Fill Patterns
╓┌───────────────────────────────────┌───────────────────────────────────────╖
Bit Map Value in Fill
────────────────────────────────────────────────────────────────────────────
☼ ∙ ∙ ☼ ☼ ∙ ∙ ☼ fill[0] = 153
☼ ☼ ∙ ∙ ☼ ☼ ∙ ∙ fill[1] = 204
∙ ☼ ☼ ∙ ∙ ☼ ☼ ∙ fill[2] = 102
∙ ∙ ☼ ☼ ∙ ∙ ☼ ☼ fill[3] = 51
☼ ∙ ∙ ☼ ☼ ∙ ∙ ☼ fill[4] = 153
☼ ☼ ∙ ∙ ☼ ☼ ∙ ∙ fill[5] = 204
Bit Map Value in Fill
────────────────────────────────────────────────────────────────────────────
☼ ☼ ∙ ∙ ☼ ☼ ∙ ∙ fill[5] = 204
∙ ☼ ☼ ∙ ∙ ☼ ☼ ∙ fill[6] = 102
∙ ∙ ☼ ☼ ∙ ∙ ☼ ☼ fill[7] = 51
────────────────────────────────────────────────────────────────────────────
For example, if you want to create the pattern in Table 11.3 for your
chart's first data series, you must reset the fill array for the first
palette structure. You can do this in five steps:
1. Declare a structure of type palettetype to hold the palette
parameters.
2. Call _pg_initchart to initialize the palettes with default values.
3. Call the presentation graphics function _pg_getpalette to retrieve a
copy of the current palette data.
4. Assign the values given in Table 11.3 to the array fill for the
first palette.
5. Call the presentation graphics function _pg_setpalette to load the
modified palette values.
The following lines of code demonstrate these five steps:
/* Declare a structure array for palette data. */
palettetype palette_struct;
.
.
.
/* Initialize chart library */
_pg_initchart();
.
.
.
/* Copy current palette data into palette_struct */
_pg_getpalette( palette_struct );
/* Reinitialize fill pattern for first palette using
values in Table .3 */
palette_struct[1].fill[0] = 153;
palette_struct[1].fill[1] = 204;
palette_struct[1].fill[2] = 102;
palette_struct[1].fill[3] = 51;
palette_struct[1].fill[4] = 153;
palette_struct[1].fill[5] = 204;
palette_struct[1].fill[6] = 102;
palette_struct[1].fill[7] = 51;
/* Load new palette data */
_pg_setpalette( palette_struct );
Now when you display your bar or column chart, the first series appears
filled with the striped pattern shown in Table 11.3.
Palette structures are used differently with pie charts. Instead of
clarifying multiple series, fill patterns, line styles, and colors, palette
structures are used to distinguish individual slices in a pie chart.
Palettes are recycled if the number of slices exceeds _PG_PALETTELEN. Thus,
the first palette dictates not only the appearance of the first slice, but
of slice number _PG_PALETTELEN as well. The second palette determines the
appearance of both the second slice and of slice number _PG_PALETTELEN + 1,
and so forth.
11.4.4 Character Pool
The last member of a palette structure is an index number in a pool of ASCII
characters:
char plotchar;
The member plotchar represents plot points on line graphs and scatter
diagrams. Each palette uses a different character to distinguish plot points
between data series.
11.5 Customizing the Chart Environment
The presentation graphics functions are designed to be flexible. You can use
the system of default values to produce professional-looking charts with a
minimum of programming effort. Or you can fine-tune the appearance of your
charts by overriding default values and initializing variables explicitly in
your program.
The header file PGCHART.H defines a structure type chartenv, which organizes
the chart environment variables. The chart environment describes everything
about a chart except the plots themselves. It is the blank page, in other
words, ready for plotting data. The environment determines the appearance of
text, axes, grid lines, and legends.
Colors and line styles in the chart environment are taken from palettes. In
this way, the appearance of titles and axis lines matches the colors and
line styles of plotted data series.
You can reset any variable in the environment.
Calling the _pg_defaultchart function fills the chart environment with
default values. Presentation graphics allows you to reset any variable in
the environment before displaying a chart. Except for adjusting the palette
values, all initialization of data is done through a chartenv type
structure.
The sample chart programs provided in Section 11.3, "Writing a Presentation
Graphics Program," illustrate how to adjust variables in the chart
environment. These programs create a structure env of type chartenv. The
structure env contains the chart environment variables, initialized by the
call to the _pg_defaultchart function. Environment variables such as the
chart title are then given specific values, as in
strcpy( env.maintitle.title, "Good Neighbor Grocery" );
Environment variables that determine colors and line styles deserve special
mention. The chart environment holds several such variables, which can be
recognized by their names. For example, the variable titlecolor specifies
the color of title text. Similarly, the variable gridstyle specifies the
line style used to draw the chart grid.
These variables are index numbers, but do not refer directly to the color
pool or line pool. They correspond instead to palette numbers. If you set
titlecolor to 2, presentation graphics uses the color code in the second
palette to determine the title's color. Thus, the title in this case would
be the same color as the chart's second data series. If you change the color
code in the palette, you'll also change the title's color.
A structure of type chartenv consists of four types of secondary structures.
The file PGCHART.H type-defines these secondary structures: titletype,
axistype, windowtype, and legendtype.
The remainder of this section describes the chart environment of
presentation graphics. It first examines structures of the four secondary
structures that make up the chart environment structure. The section
concludes with a description of the chartenv structure type. Each section
begins with a brief explanation of the structure's purpose, followed by a
listing of the structure type definition as it appears in the PGCHART.H
file. All symbolic constants are defined in the file PGCHART.H.
11.5.1 titletype Structures
Structures of type titletype determine text, color, and placement of titles
appearing in the graph. The PGCHART.H file defines the structure type as
typedef struct
{
char title[_PG_TITLELEN]; /* Title text */
short titlecolor; /* Palette color
for title text */
short justify; /* _PG_LEFT, _PG_CENTER,
_PG_RIGHT */
} titletype;
The following list describes titletype members:
Member Variable Description
────────────────────────────────────────────────────────────────────────────
justify An integer specifying how the title is
justified within the chart window. The
symbolic constants defined
in PGCHART.H for this variable are
_PG_LEFT,
_PG_CENTER, and _PG_RIGHT.
titlecolor An integer between 1 and _PG_PALETTELEN
that specifies a title's color. The
default value for
titlecolor is 1.
title[_PG_TITLELEN] A character array containing title text.
For example, if env is a structure of
type chartenv, then env.maintitle.title
holds the character string used for the
main title of the chart. Similarly,
env.xaxis.axistitle.title contains the
x axis title. The number of characters
in a title must be one less than
_PG_TITLELEN to allow room for a null
terminator.
11.5.2 axistype Structures
Structures of type axistype contain variables for the axes such as color,
scale, grid style, and tick marks. The PGCHART.H file defines the structure
type as the following:
typedef struct
{
short grid; /* TRUE=grid lines drawn;
FALSE=no lines */
short gridstyle; /* Style bytes for grid */
titletype axistitle; /* Title definition
for axis */
short axiscolor; /* Color for axis */
short labeled; /* TRUE=ticks marks and titles
drawn */
short rangetype; /* _PG_LINEARAXIS,
_PG_LOGAXIS */
float logbase; /* Base used if log axis */
short autoscale; /* TRUE=next 7 values
calculated by system */
float scalemin; /* Minimum value of scale */
float scalemax; /* Maximum value of scale */
float scalefactor; /* Scale factor for data on
this axis */
titletype scaletitle; /* Title definition for
scaling factor */
float ticinterval; /* Distance between tick marks
(world coord.) */
short ticformat; /* _PG_EXPFORMAT or
_PG_DECFORMAT */
short ticdecimals; /* Number of decimals for tick
labels (max=9) */
} axistype;
The following list describes axistype member variables:
Member Variable Description
────────────────────────────────────────────────────────────────────────────
autoscale A Boolean variable. If autoscale is set
to TRUE,
presentation graphics automatically
determines
values for scalefactor, scalemax,
scalemin,
scaletitle, ticdecimals, ticformat, and
ticinterval
(see below). If autoscale equals FALSE,
these seven variables must be specified
in your program.
axiscolor An integer between 1 and _PG_PALETTELEN
that specifies the color used for the
axis and parallel grid lines. (See
description for gridstyle below.) Note
that this member does not determine the
color of
the axis title. That selection is made
through the
axistitle structure.
axistitle A titletype structure that defines the
title of the associated axis. The title
of the y axis displays vertically to the
left of the y axis, and the title of the
x axis displays horizontally below the x
axis.
grid A Boolean true/false value that
determines whether grid lines are drawn
for the associated axis. Grid lines span
the data window perpendicular to the
axis.
gridstyle An integer between 1 and _PG_PALETTELEN
that specifies the grid's line style.
Lines can be solid, dashed, dotted, or
some combination. The default value for
gridstyle is 1.
Note that the color of the parallel axis
determines the color of the grid lines.
Thus, the x axis grid is the same color
as the y axis, and the y axis grid is
the same color as the x axis.
labeled A Boolean value that determines whether
tick marks and labels are drawn on the
axis. Axis labels should not be confused
with axis titles. Axis labels are
numbers or descriptions such as "23.2"
or "January" attached to each tick mark.
logbase If rangetype is logarithmic, the logbase
variable determines the log base used to
scale the axis. The default value is 10.
rangetype An integer that determines whether the
scale of the axis is linear or
logarithmic. The variable rangetype
applies only to value data.
Specify a linear scale with
_PG_LINEARAXIS. A linear scale is best
when the difference between axis minimum
and maximum is relatively small. For
example, a linear axis range 0 - 10
results in 10 tick marks evenly spaced
along the axis.
Use _PG_LOGAXIS to specify a logarithmic
rangetype. Logarithmic scales are useful
when
the range is very large or when the data
varies exponentially. Line graphs of
exponentially varying data can be made
straight with a logarithmic
rangetype.
scalefactor All numeric data are scaled by dividing
each
value by scalefactor. For relatively
small values,
scalefactor should be 1, which is the
default. But data with large values
should be scaled by an appropriate
factor. For example, data in the range
2 million - 20 million should be plotted
with
scalemin set to 2, scalemax set to 20,
and
scalefactor set to 1 million.
If autoscale is set to TRUE,
presentation graphics automatically
determines a suitable value for
scalefactor based on the range of data
to be plotted. Presentation graphics
selects only values that are a factor of
1 thousand─that is, values such as 1
thousand, 1 million, or 1 billion. It
then labels the
scaletitle appropriately (see below). If
you desire some other value for scaling,
you must set autoscale to FALSE and set
scalefactor to the desired scaling value.
scalemax Highest value represented by the axis.
scalemin Lowest value represented by the axis.
scaletitle A titletype structure defining a string
of text that
describes the value of scalefactor. If
autoscale is TRUE, presentation graphics
automatically writes a scale description
to scaletitle. If autoscale equals FALSE
and scalefactor is 1, scaletitle.title
should be blank. Otherwise your program
should copy an appropriate scale
description to scaletitle.title, such as
"( x 1000)," "(in millions of units),"
or "times 10 thousand dollars."
For the y axis, the scaletitle text
displays vertically between the axis
title and the y axis. For the x axis,
the scale title appears below the x axis
title.
ticdecimals Number of digits to display after the
decimal point in tick labels. Maximum
value is 9. (This variable applies only
to axes with value data and is ignored
for the category axis.)
ticformat An integer that determines format of the
labels
assigned to each tick mark. Set
ticformat to
_PG_EXPFORMAT for exponential format or
to _PG_DECFORMAT for decimal. The
default is _PG_DECFORMAT. (This variable
applies only to axes with value data and
is ignored for the category axis.)
ticinterval Sets interval between tick marks on the
axis. The tick interval is measured in
the same units as the numeric data
associated with the axis. For example,
if 2 sequential tick marks correspond to
the values 20 and 25, the tick interval
between them is 5. (This variable
applies only to axes with value data and
is ignored for the category axis.)
11.5.3 windowtype Structures
Structures of type windowtype contain sizes, locations, and color codes for
the three windows produced by presentation graphics: the chart window, the
data window, and the legend. Windows are located on the screen relative to
the screen's logical origin. By changing the logical origin, you can display
charts that are partly or completely off the screen.
The PGCHART.H file defines windowtype as the following:
typedef struct
{
short x1; /* Left edge of window in
pixels */
short y1; /* Top edge of window in
pixels */
short x2; /* Right edge of window in
pixels */
short y2; /* Bottom edge of window in
pixels */
short border; /* TRUE for border, FALSE
otherwise */
short background; /* Internal palette color for
window background */
short borderstyle; /* Style bytes for window
border */
short bordercolor; /* Internal palette color for
window border */
} windowtype;
The following list describes windowtype member variables:
Member Variable Description
────────────────────────────────────────────────────────────────────────────
background An integer between 1 and _PG_PALETTELEN
that specifies the window's background
color. The default value for background
is 1.
border A Boolean variable that determines
whether a border frame is drawn around a
window.
bordercolor An integer between 1 and _PG_PALETTELEN
that specifies the color of the window's
border frame. The default value is 1.
borderstyle An integer between 1 and _PG_PALETTELEN
that specifies the line style of the
window's border frame. The default value
is 1.
x1, y1, x2, y2 Window coordinates in pixels. The
ordered pair
(x1, y1) specifies the coordinate of the
upper left corner of the window. The
ordered pair ( x2, y2 ) specifies the
coordinate of the lower right corner.
The reference point for the coordinates
depends on the type of window. The chart
window is located relative to the
logical origin, usually the upper left
corner of the screen. The data and
legend windows are located relative to
the upper left corner of the chart
window. This allows you to change the
position of the chart window without
having to redefine coordinates for the
other two windows.
11.5.4 legendtype Structures
Structures of type legendtype contain size, location, and colors of the
chart legend. The PGCHART.H file defines the structure type as the
following:
typedef struct
{
short legend; /* TRUE=draw legend;
FALSE=no legend */
short place; /* _PG_RIGHT, _PG_BOTTOM,
_PG_OVERLAY */
short textcolor; /* Palette color for text*/
short autosize; /* TRUE=system calculates
legend size */
windowtype legendwindow; /* Window definition for
legend */
} legendtype;
The following list describes legendtype member variables:
Member Variable Description
────────────────────────────────────────────────────────────────────────────
autosize A Boolean true/false variable that
determines whether presentation graphics
is to automatically
calculate the size of the legend. If
autosize equals FALSE, the legend window
must be specified in the legendwindow
structure (see below).
legend A Boolean true/false variable that
determines whether a legend is to appear
on the chart. The legend variable is
ignored by functions that graph
single-series charts.
legendwindow A windowtype structure that defines
coordinates, background color, and
border frame for the legend. Coordinates
given in legendwindow are ignored if
autosize is set to TRUE.
place An integer that specifies the location
of the legend relative to the data
window. Setting place equal
to the constant _PG_RIGHT positions the
legend
to the right of the data window. Setting
place to
_PG_BOTTOM positions the legend below
the data window. Setting place to
_PG_OVERLAY positions the legend within
the data window.
These settings influence the size of the
data window. If place equals _PG_RIGHT
or _PG_BOTTOM, presentation graphics
automatically sizes the data window to
accommodate the legend. If place equals
_PG_OVERLAY, the data window is sized
without regard to the legend.
textcolor An integer between 1 and _PG_PALETTELEN
that specifies the color of text within
the legend window.
11.5.5 chartenv Structures
A structure of type chartenv defines the chart environment. The following
listing shows that a chartenv type structure consists almost entirely of
structures of the four types described above.
The PGCHART.H file defines the chartenv structure type as the following:
typedef struct
{
short charttype; /* Chart type */
short chartstyle; /* Chart style */
windowtype chartwindow; /* Window definition for
overall chart */
windowtype datawindow; /* Window definition for data
part of chart */
titletype maintitle; /* Main chart title */
titletype subtitle; /* Chart subtitle */
axistype xaxis; /* Definition for x axis */
axistype yaxis; /* Definition for y axis */
legendtype legend; /* Definition for legend */
} chartenv;
Initialize the chart environment with the _pg_defaultchart function.
The data in a chartenv type structure is initialized by calling the function
_pg_defaultchart. If your program does not call _pg_defaultchart, it must
explicitly define every variable in the chart environment─a tedious
procedure. The recommended method for adjusting the appearance of your chart
is to initialize variables for the proper chart type by calling the
_pg_defaultchart function, and then to reassign selected environment
variables such as titles.
The following list describes chartenv member variables:
Member Variable Description
────────────────────────────────────────────────────────────────────────────
chartstyle An integer that determines the style of
the chart
(see Table 11.2). Legal values for
chartstyle are _PG_PERCENT and
_PG_NOPERCENT for pie charts;
_PG_PLAINBARS and _PG_STACKEDBARS for
bar and column charts; and _PG_POINTONLY
and _PG_POINTANDLINE for line graphs
and scatter diagrams. This variable
corresponds to the third argument for
the _pg_defaultchart function.
charttype An integer that determines the type of
chart displayed. The value of charttype
is _PG_BARCHART, _PG_COLUMNCHART,
_PG_LINECHART, _PG_SCATTERCHART, or
_PG_PIECHART. This variable corresponds
to the second argument for the
_pg_defaultchart function.
chartwindow A windowtype structure that defines the
appearance of the chart window.
datawindow A windowtype structure that defines the
appearance of the data window.
legend A legendtype structure that defines the
appearance of the legend window.
maintitle A titletype structure that defines the
appearance of the main title of the
chart.
subtitle A titletype structure that defines the
appearance of the chart's subtitle.
xaxis An axistype structure that defines the
appearance of the x axis. (This variable
is not applicable for pie charts.)
yaxis An axistype structure that defines the
appearance of the y axis. (This variable
is not applicable for pie charts.)
Chapter 12 Programming with Mixed Languages
────────────────────────────────────────────────────────────────────────────
There are times when your Microsoft C programs need to call programs written
in other languages or when programs written in other languages need to call
your C functions. This is called mixed-language programming. For example,
when a particular subprogram is available commercially in a language other
than C or when algorithms are described more naturally in a different
language, you need to use more than one language.
This chapter describes the elements of mixed-language programming─how to
make calls from programs written in one language to routines written in
another.
12.1 Making Mixed-Language Calls
Mixed-language programming always involves a call to a function, procedure,
or subroutine. For example, a BASIC main module may need to execute a
specific task that you would like to program separately. Instead of calling
a BASIC subprogram, however, you decide to call a C function.
Mixed-language calls involve calling functions in separate modules. Instead
of compiling all of your source modules with the same compiler, you use
different compilers. In the instance mentioned above, you would compile the
mainmodule source file with the BASIC compiler, another source file (written
in C) with the C compiler, and then link the two object files.
Figure 12.1 illustrates how the syntax of a mixed-language call works, using
the instance mentioned above.
(This figure may be found in the printed book.)
In Figure 12.1, the BASIC call to C is CALL Prn, similar to a call to a
BASIC subprogram. There are two differences between this mixed-language call
and a call between two BASIC modules:
1. The subprogram Prn is implemented in C, using standard C syntax.
2. The implementation of the call in BASIC is affected by the DECLARE
statement, which uses the CDECL keyword to create compatibility with
C. The DECLARE statement (which is described in detail in the
Microsoft BASIC Language Reference and the Microsoft BASIC
Programmer's Guide) is an example of a mixed-language "interface"
statement. These interface statements override default naming and
calling conventions. Each language provides its own form of interface.
You can make mixed-language calls to routines regardless of whether they
have return values. (In this chapter, "routine" refers to any function,
procedure, or subroutine that can be called from another module.)
Table 12.1 shows the correspondence between calls to routines in different
languages.
Table 12.1 Language Equivalents for Routine Calls
╓┌───────────────────┌────────────────────┌──────────────────────────────────╖
Language Return Value No Return Value
────────────────────────────────────────────────────────────────────────────
Assembly Language Procedure Procedure
BASIC FUNCTION procedure Subprogram
C function (void) function
FORTRAN FUNCTION SUBROUTINE
Pascal Function Procedure
────────────────────────────────────────────────────────────────────────────
For example, a C module can make a subprogram call to a FORTRAN subroutine.
You can prototype a FORTRAN subroutine as a function with a void type.
────────────────────────────────────────────────────────────────────────────
NOTE
BASIC DEF FN functions and GOSUB subroutines cannot be called from another
language.
────────────────────────────────────────────────────────────────────────────
12.2 Language Convention Requirements
To mix languages, the calling program must observe the same conventions as
the called program. The conventions described in this section govern the
following:
■ How compilers treat identifiers, including function and variable names
(naming convention)
■ How the subprogram call is implemented (calling convention)
■ How parameters are passed (parameter-passing convention)
12.2.1 Naming Convention Requirement
Both the calling program and the called subprogram must agree on the names
of identifiers. Identifiers can refer to subprograms (functions, procedures,
and subroutines) or to variables that have a public or global scope. Each
language alters the names of identifiers.
The term "naming convention" refers to the way a compiler alters the name of
the routine before placing it in an object file. Languages may alter the
identifier names differently. You can choose between several naming
conventions to ensure that the names in the calling program agree with those
in the called program. If the names of called routines are stored
differently in each object file, the linker will not be able to find a
match. It will instead report unresolved external references.
Microsoft compilers place machine code into object files; they also place
the names of all publicly accessed routines and variables in object files.
The linker can then compare the name of a routine called in one module with
the name of a routine defined in another module, and recognize a match.
Names are stored in the ASCII (American Standard Code for Information
Interchange) character set.
Some languages translate names to uppercase.
BASIC, FORTRAN, and Pascal use similar naming conventions. They translate
each letter to uppercase. BASIC type declaration characters (%, &, !, #, $)
are dropped.
Each language recognizes a different number of characters. FORTRAN
recognizes the first 31 characters of any name (unless identifier names are
truncated), Pascal the first 8, and BASIC the first 40. If a name is longer
than the language will recognize, additional characters are simply not
placed in the object file.
────────────────────────────────────────────────────────────────────────────
NOTE
Versions of Microsoft FORTRAN previous to version 5.0 truncated identifiers
to six characters. As of version 5.0, FORTRAN retains up to 31 characters of
significance unless you use the /4Yt option.
────────────────────────────────────────────────────────────────────────────
C is a case-sensitive language.
The C compiler does not translate any letters to uppercase. It inserts a
leading underscore ( _ ) in front of the name of each routine. C recognizes
the first 31 characters of a name.
Differences in naming conventions are dealt with automatically by
mixedlanguage keywords, as long as you follow two rules:
1. If you use any FORTRAN routines that were compiled with the /4Yt
command-line option or with the $TRUNCATE metacommand enabled, make
all names 6 characters or less. Make all names 6 characters or less
when using FORTRAN routines compiled with versions of the FORTRAN
compiler prior to 5.0.
2. Do not use the /NOIGNORECASE linker option (which causes the linker to
treat identifiers in a case-sensitive manner). With C modules, this
means that you must be careful not to rely upon differences between
uppercase and lowercase letters when programming.
CL automatically uses the /NOIGNORECASE option when linking. To solve
the problems created by this behavior, either link separately with the
LINK utility, or use all lowercase letters in your C function names
and public variables (global variables that are not declared as
static).
────────────────────────────────────────────────────────────────────────────
NOTE
If you use the command-line option /Gc (generate Pascal-style function
calls) when you compile, or if you declare a function or variable with the
_pascal keyword, the compiler will translate your identifiers to uppercase.
────────────────────────────────────────────────────────────────────────────
Figure 12.2 illustrates a complete mixed-language development example,
showing how naming conventions enter into the process.
(This figure may be found in the printed book.)
In Figure 12.2, note that the BASIC compiler inserts a leading underscore in
front of Prn as it places the name into the object file, because the CDECL
keyword directs the BASIC compiler to use the C naming convention. BASIC
will also convert all letters to lowercase when this keyword is used.
(Converting letters to lowercase is not part of the C naming convention;
however, it is consistent with the programming style of many C programs.)
12.2.2 Calling Convention Requirement
The term "calling convention" refers to the way a language implements a
call. The choice of calling convention affects the machine instructions that
a compiler generates to execute (and return from) a function, procedure, or
subroutine call.
It is crucial that the two routines concerned (the routine issuing a call
and the routine being called) use the same protocol. Otherwise, the
processor may receive inconsistent instructions, causing the program to
behave incorrectly.
The use of a calling convention affects programming in three ways:
1. The calling routine uses a calling convention to determine the order
in which to pass arguments (parameters) to another routine. This
convention can be specified in a mixed-language interface statement or
declaration.
2. The called routine uses a calling convention to determine the order in
which to receive the parameters passed to it. In most languages, this
convention can be specified in the routine's heading. BASIC, however,
always uses its own convention to receive parameters.
3. Both the calling routine and the called routine must agree on which of
them is responsible for adjusting the stack after all parameters are
removed.
In other words, each call to a routine uses a certain calling convention;
each routine heading specifies or assumes some calling convention. The two
conventions must be compatible. With all languages except BASIC, it is
possible to change the calling convention at the point of the call or at the
declaration of the called routine. Usually, however, it is easier to adopt
the convention of the called routine. For example, a C function would use
its own convention to call another C function, and would use the Pascal
convention to call Pascal.
BASIC, FORTRAN, and Pascal use the same standard calling convention. C uses
a different convention.
Effects of Calling Conventions
Calling conventions dictate three things:
1. The way parameters are communicated from one routine to another (in
Microsoft mixed-language programming, parameters or pointers to the
parameters are passed on the stack)
2. The order in which parameters are passed from one routine to another
3. The part of the program responsible for adjusting the stack
Some languages pass parameters in a different order than C.
The BASIC, FORTRAN and Pascal calling conventions push parameters onto the
stack in the order in which they appear in the source code. For example, the
BASIC statement
CALL Calc( A, B )
pushes argument A onto the stack before it pushes B. These conventions
also specify that the stack is adjusted by the called routine just before
returning control to the caller.
The C calling convention pushes parameters onto the stack in the reverse
order from their appearance in the source code. For example, the C function
call
calc( a, b );
pushes b onto the stack before it pushes a. In contrast with the other
high-level languages, the C calling convention specifies that a calling
routine always adjusts the stack immediately after the called routine
returns control.
The BASIC, FORTRAN, and Pascal conventions produce slightly less object
code. However, the C convention makes calling with a variable number of
parameters possible. (Because the first parameter is always the last one
pushed, it is always on the top of the stack; therefore it has the same
address relative to the frame pointer, regardless of how many parameters
were actually passed.)
────────────────────────────────────────────────────────────────────────────
NOTE
The _fastcall keyword, which specifies that parameters are to be passed in
registers, is incompatible with programs written in other languages. Avoid
using _fastcall or the /Gr command-line option for C functions that you
intend to make public to BASIC, FORTRAN, or Pascal programs.
────────────────────────────────────────────────────────────────────────────
12.2.3 Parameter-Passing Requirement
Your programs must agree on the calling convention and the naming
convention; they must also agree on the order in which they pass parameters.
It is important that your routines send parameters in the same way to ensure
proper data transmission and correct program results.
Microsoft compilers support three methods for passing a parameter:
Method Description
────────────────────────────────────────────────────────────────────────────
Near reference Passes a variable's near (offset)
address. This address is expressed as an
offset from the default data segment.
This method gives the called routine
direct access to the variable itself.
Any change the routine makes to the
parameter changes the variable in the
calling routine.
Far reference Passes a variable's far (segmented)
address.
This method is similar to passing by
near reference, except that a longer
address is passed. This method is slower
than passing by near reference, but is
necessary when you pass data that is
outside the default data segment. (This
is an issue in BASIC or Pascal only if
you have specifically requested far
memory.)
Value Passes only the variable's value, not
its address.
With this method, the called routine
knows the value of the parameter but has
no access to the original variable.
Changes to a value passed by a parameter
have no affect on the value of the
parameter in the calling routine.
These different parameter-passing methods mean that you must consider the
following when programming with mixed languages:
■ You need to make sure that the called routine and the calling routine
use the same method for passing each parameter (argument). In most
cases, you will need to check the parameter-passing defaults used by
each language and possibly make adjustments. Each language has
keywords or language features that allow you to change
parameter-passing methods.
■ You may want to choose a specific parameter-passing method rather than
using the defaults of any language.
Table 12.2 summarizes the parameter-passing defaults for each language.
Table 12.2 Parameter-Passing Defaults
╓┌─────────┌─────────────────────┌─────────────────────┌─────────────────────╖
Language Near Reference Far Reference By Value
Language Near Reference Far Reference By Value
────────────────────────────────────────────────────────────────────────────
BASIC All --- ---
C Near arrays Far arrays All data except
arrays
FORTRAN All (medium model) All (large model) With attributes(1)
Pascal VAR, CONST VARS, CONSTS Other parameters
────────────────────────────────────────────────────────────────────────────
(1) When a PASCAL or C attribute is applied to a FORTRAN routine, passing
by value becomes the default.
12.3 Compiling and Linking
After you have written your source files and decided on a naming convention,
a calling convention, and a parameter-passing convention, you are ready to
compile and link individual modules.
12.3.1 Compiling with Correct Memory Models
With BASIC, FORTRAN, and Pascal, no special options are required to compile
source files that are part of a mixed-language program.
With C, not all memory models are compatible with other languages.
BASIC, FORTRAN, and Pascal use only far (segmented) code addresses.
Therefore, you must use one of two techniques with C programs that call one
of these languages: compile C modules in medium, large, or huge model (using
the /AX command-line options), because these models also use far code
addresses; or apply the _far keyword to the definitions of C functions you
make public. If you use the /AX command-line option to specify medium,
large, or huge model, all your function calls become far by default. This
means you don't have to declare your functions explicitly with the _far
keyword.
Choice of memory model affects the default data pointer size in C and
FORTRAN, although this default can be overridden with the _near and _far
keywords. With C and FORTRAN, choice of memory model also affects whether
data objects are located in the default data segment; if a data object is
not located in the default data segment, it cannot be passed by near
reference.
For more information about code and data address sizes in C, refer to
Chapter 2, "Managing Memory."
12.3.2 Linking with Language Libraries
In most cases, you can easily link modules compiled with different
languages. Do any of the following to ensure that all required libraries
link in the correct order:
■ Put all language libraries in the same directory as the source files.
■ List directories containing all needed libraries in the LIB
environment variable.
■ Let the linker prompt you for libraries.
In each of the cases above, the linker finds libraries in the order that it
requires them. If you enter the library names on the command line, make sure
you enter them in an order that allows the linker to resolve your program's
external references. Here are some points to observe when specifying
libraries on the command line:
■ If you are using FORTRAN to write one of your modules, you need to
link with the /NOD (no default libraries) option and explicitly
specify all the libraries you need on the link command line. You can
also specify these libraries with an automatic-response file (or batch
file), but you cannot use a default-library search.
■ If your program uses both FORTRAN and C, specify the library for the
most recent of the two language products first. In addition, make sure
that you choose a C-compatible library when you install FORTRAN.
■ If you are listing BASIC libraries on the LINK command line, specify
those libraries first.
The following example shows how to link two modules, mod1 and mod2, with
a user library, GRAFX, the C run-time library, LLIBCE, and the FORTRAN
run-time library, LLIBFORE:
LINK /NOD mod1 mod2,,,GRAFX+LLIBCE+LLIBFORE
12.4 C Calls to High-Level Languages
Just as you can call Microsoft C routines from other Microsoft languages,
you can call routines written in Microsoft FORTRAN and Pascal from C. With
FORTRAN, Pascal, and C, freestanding routines can be written with no
restriction. When calling BASIC routines, however, you must write the main
program in BASIC; any subprograms are free to call one another, whether they
are written in C or BASIC.
For information about how to pass particular kinds of data, see Section
12.9, "Handling Data in Mixed-Language Programming."
Executing a Mixed-Language Call
The C interface to other languages uses standard C prototypes, with the
_fortran or _pascal keyword. Using either of these keywords causes the
routine to be called with the FORTRAN/Pascal naming and calling convention.
(The FORTRAN/Pascal convention also works for BASIC.) Here are the
recommended steps for executing a mixed-language call from C:
1. Write a prototype for each mixed-language routine called. The
prototype should declare the routine extern for the purpose of program
documentation.
Instead of using the _fortran or _pascal keyword, you can simply
compile with the Pascal calling convention option (/Gc). The /Gc
option causes all functions in the module to use the FORTRAN/Pascal
naming and calling conventions, except where you apply the _cdecl
keyword.
2. Pass the values of variables or pointers to variables. You can obtain
a pointer to a variable with the address-of (&) operator.
In C, array names are always passed as pointers to the first element
of the array; they are always passed by reference.
The prototype you declare for your function ensures that you are
passing the correct length address (that is, near or far).
3. Issue a function call in your program as though you were calling a C
function.
4. Always compile the C module in either medium, large, or huge model, or
use the _far keyword in your function prototype. This ensures that a
far (intersegment) call is made to the routine.
Using the _fortran or _pascal Keyword
There are two rules of syntax that apply when you use the _fortran or
_pascal keyword:
1. The _fortran and _pascal keywords modify only the item immediately to
their right.
2. The _near and _far keywords can be used with the _fortran and _pascal
keywords in prototypes. The sequences _fortran _far and _far _fortran
are equivalent.
The keywords _pascal and _fortran have the same effect on the program; using
one or the other makes no difference except for internal program
documentation. Use _fortran to declare a FORTRAN routine, _pascal to declare
a Pascal rou-tine, and either keyword to declare a BASIC routine.
The following examples demonstrate the syntax rules presented above.
The example below declares func to be a BASIC, Pascal, or FORTRAN function
taking two short parameters and returning a short value.
short _pascal func( short sarg1, short sarg2 );
The example below declares func to be pointer to a BASIC, Pascal, or FORTRAN
routine that takes a long parameter and returns no value. The keyword void
is appropriate when the called routine is a BASIC subprogram, Pascal
procedure, or FORTRAN subroutine, since it indicates that the function
returns no value.
void ( _fortran * func )( long larg );
The example below declares func to be a _near BASIC, Pascal, or FORTRAN
routine. The routine receives a double parameter by reference (because it
expects a pointer to a double) and returns a short value.
short _near _pascal func( _near double * darg );
The example below is equivalent to the preceding example ( _pascal _near is
equivalent to _near _pascal).
short _pascal _near func( _near double * darg );
You can make C adopt the conventions of other languages.
When you call a BASIC subprogram, you must use the FORTRAN/Pascal
conventions to make the call. When you call FORTRAN or Pascal, however, you
have a choice. You can make C adopt the conventions described in the
previous section, or you can make the FORTRAN or Pascal routine adopt the C
conventions.
To make a FORTRAN or Pascal routine adopt the C conventions, put the C
attribute in the heading of the routine's definition. The following example
shows the syntax for the C attribute in a FORTRAN subroutine-definition
heading:
SUBROUTINE FFROMC [C] (N)
INTEGER*2 N
The following example shows the syntax for the C attribute in a Pascal
procedure-definition heading:
PROCEDURE Pfromc( n : INTEGER ) [C];
To make a C function adopt the FORTRAN/Pascal conventions, declare the
function as _fortran or _pascal. For example,
void _pascal CfromP( int n );
12.5 C Calls to BASIC
No BASIC routine can be executed unless the main program is in BASIC,
because a BASIC routine requires the environment to be initialized in a way
that is unique to BASIC. No other language will perform this special
initialization.
However, your program can start up in BASIC, call a C function that does
most of the work of the program, and then call BASIC subprograms and
function procedures as needed. Figure 12.3 illustrates how to do this.
(This figure may be found in the printed book.)
Follow these rules when you call BASIC from C:
1. Start up in a BASIC main module. You will need to use the DECLARE
statement to provide an interface to the C module.
2. In the C module, write a prototype for the BASIC routine and include
type information for parameters. Use either the _fortran or _pascal
keyword to modify the routine itself.
3. Make sure that all data are passed as near pointers. BASIC can pass
data in a variety of ways but is unable to receive data in any form
other than near reference. With near pointers, the program assumes
that the data are in the default data segment. If you want to pass
data that are not in the default data segment, copy the data to a
variable in the default data segment.
4. Compile the C module in medium or large model to ensure far
(intersegment) calls.
The example below demonstrates a BASIC program that calls a C function. The
C function then calls a BASIC function that returns twice the number passed
to it and a BASIC subprogram that prints two numbers.
' BASIC source
'
' The main program is in BASIC because of BASIC's start-up
' requirements. The BASIC main program calls the C function
' Cprog.
'
' Cprog calls the BASIC subroutine Dbl.
'
DEFINT A-Z
DECLARE SUB Cprog CDECL()
CALL Cprog
END
'
FUNCTION Dbl(N) STATIC
Dbl = N*2
END FUNCTION
'
SUB Printnum(A,B) STATIC
PRINT "The first number is ";A
PRINT "The second number is ";B
END SUB
/* C source; compile in medium or large model */
int _fortran dbl( int _near * N );
void _fortran printnum( int _near * A, int _near * B );
void cprog()
{
int a = 5;
int b = 6;
printf( "%d times 2 is %d\n", a, dbl( &a ) );
printnum( &a, &b );
}
In the previous example, note that the addresses of a and b are passed,
since BASIC expects to receive addresses for parameters. This is important
because C passes parameters by value unless you use the address-of (&)
operator to obtain the address, or are passing an array. Also note that the
function prototype for printnum declares the parameters as near pointers.
The prototype causes the
variables to be passed by near reference. If a or b is declared as _far,
the C compiler issues a warning that you are converting a far pointer to a
near pointer and that a segment was lost in the conversion.
Calling and naming conventions are resolved by the CDECL keyword in the
BASIC declaration of Cprog, and by the _fortran keyword in the C declaration
of dbl and printnum.
BASIC can invoke one of your functions as part of the termination
procedure.
Versions of QuickBASIC later than 4.0 provide a "user entry point,"
B_OnExit, which can be called directly from C. The B_OnExit function enables
you to make sure you have performed an orderly termination. The following
code shows how to use B_OnExit.
#include <malloc.h> /* For declaration of _fmalloc */
#include <stdlib.h> /* For declaration of onexit_t */
/* The prototype for B_OnExit declares it as a function
* returning type onexit_t that takes one parameter. The
* parameter is a far pointer to a function that returns
* no value.
*/
extern onexit_t _pascal _far B_OnExit( onexit_t );
void TermProc( void );
int * p_IntArray;
void InitProc( void )
{
/* Allocate far space for 20-integer array */
p_IntArray = (int *)_fmalloc( 20 * sizeof( int ) );
/* Log termination routine (TermProc) with BASIC. */
B_OnExit( TermProc );
}
void TermProc( void )
{
free( p_IntArray ); /* Release far space allocated */
} /* previously by InitProc. */
12.6 C Calls to FORTRAN
This section shows two examples of C-FORTRAN programs. There are two types
of subprogram calls to FORTRAN routines: calls to subroutines and calls to
functions. Functions return a value, while subroutines do not. The examples
in the next sections illustrate how to handle the difference between
function and subroutine calls.
12.6.1 Calling a FORTRAN Subroutine from C
The example below demonstrates a C main module calling a FORTRAN subroutine,
MAXPARAM. This subroutine adjusts the lower of two arguments to be equal to
the higher argument.
/* C source file - calls FORTRAN subroutine
* Compile in medium or large model
*/
extern void _fortran maxparam( int _near * I, int _near * J );
/* Declare as void, because there is no return value.
* FORTRAN keyword causes C to use FORTRAN/Pascal
* calling and naming conventions.
* Two integer parameters, passed by near reference.
*/
main()
{
int a = 5;
int b = 7;
printf( "a = %d, b = %d", a, b );
maxparam( &a, &b );
printf( "a = %d, b = %d", a, b );
}
C FORTRAN source file, subroutine MAXPARAM
C
$NOTRUNCATE
SUBROUTINE MAXPARAM (I, J)
INTEGER*2 I [NEAR]
INTEGER*2 J [NEAR]
C
C I and J received by near reference,
C because of NEAR attribute
C
IF (I .GT. J) THEN
J = I
ELSE
I = J
ENDIF
END
In the previous example, the C program adopts the naming convention and
call-ing convention of the FORTRAN subroutine. The two programs must agree
on whether parameters are to be passed by reference or by value. The
following keywords affect how the two programs interface:
■ The _fortran keyword directs C to call maxparam with the FORTRAN/
Pascal naming convention (as MAXPARAM); _fortran also directs C to
call maxparam with the FORTRAN/Pascal calling convention.
■ Since the FORTRAN subroutine MAXPARAM may alter the value of either
parameter, both parameters must be passed by reference. In this case,
near reference was chosen; this method is specified in C by the use of
near pointers, and in FORTRAN by applying the NEAR keyword to the
parameter declarations.
Far reference could have been specified by using far pointers in C. In
that case, you would not declare the FORTRAN subroutine MAXPARAM
with the NEAR keyword. If you compile the FORTRAN program in medium
model, declare MAXPARAM using the FAR keyword.
12.6.2 Calling a FORTRAN Function from C
The example below demonstrates a C main module calling the FORTRAN function
fact. This function returns the factorial of an integer value.
/* C source file - calls FORTRAN function.
* Compile in medium or large model.
*/
int _fortran fact( int N );
/* FORTRAN keyword causes C to use FORTRAN/Pascal
* calling and naming conventions.
* Integer parameter passed by value.
*/
main()
{
int x = 3;
int y = 4;
printf( "The factorial of x is %4d", fact( x ) );
printf( "The factorial of y is %4d", fact( y ) );
printf( "The factorial of x+y is %4d", fact( x + y ) );
}
C FORTRAN source file - factorial function
C
$NOTRUNCATE
INTEGER*2 FUNCTION FACT (N)
INTEGER*2 N [VALUE]
C
C N is received by value, because of VALUE attribute
C
INTEGER*2 I
FACT = 1
DO 100 I = 1, N
FACT = FACT * I
100 CONTINUE
RETURN
END
In the example above, the C program adopts the naming convention and calling
convention of the FORTRAN subroutine. Both programs must agree on whether
parameters are passed by reference or by value. Note that the C program
passes the parameters by value rather than by reference. Passing parameters
by value is the default for C. To accept parameters passed by value, the
keyword VALUE is used in the declaration of N in the FORTRAN function. The
_fortran keyword directs C to call fact with the FORTRAN/Pascal naming
convention (as FACT); _fortran also directs C to call fact with the
FORTRAN/Pascal calling convention.
When passing a parameter that should not be changed, pass the parameter by
value. Passing by value is the default method in C and is specified in
FORTRAN by applying the VALUE attribute to the parameter declaration.
12.7 C Calls to Pascal
This section shows two examples of C-Pascal programs. There are two types of
subprogram calls to Pascal routines: calls to procedures and calls to
functions. Functions return a value, while procedures do not. The examples
in the next sections illustrate how to handle the difference between
function and procedure calls.
12.7.1 Calling a Pascal Procedure from C
The following example demonstrates a C main module calling a Pascal
procedure, maxparam. This procedure adjusts the lower of two arguments to
be equal to the higher argument.
/* C source file - calls Pascal procedure.
* Compile in medium or large model.
*/
void _pascal maxparam( int _near * a, int _near * b );
/* Declare as void, because there is no return value.
* The _pascal keyword causes C to use FORTRAN/Pascal
* calling and naming conventions.
* Two integer params, passed by near reference.
*/
main()
{
int a = 5;
int b = 7;
printf( "a = %d, b = %d", a, b );
maxparam( &a, &b );
printf( "a = %d, b = %d", a, b );
}
{ Pascal source code - Maxparam procedure. }
MODULE Psub;
PROCEDURE Maxparam( VAR a:INTEGER; VAR b:INTEGER );
{ Two integer parameters are received by near reference. }
{ Near reference is specified with the VAR keyword. }
BEGIN
if a > b THEN
b := a
ELSE
a := b
END;
END.
In the example above, the C program adopts the Pascal naming convention and
calling convention. Both programs must agree on whether parameters are
passed by reference or by value; the following keywords affect the
conventions:
■ The _pascal keyword directs C to call Maxparam with the FORTRAN/
Pascal naming convention (as MAXPARAM); _pascal also directs C to
call Maxparam with the FORTRAN/Pascal calling convention.
■ Since the procedure Maxparam can alter the value of either
parameter, both parameters must be passed by reference. In this case,
near reference is used; this method is specified in C by the use of
near pointers, and in Pascal with the VAR keyword.
Far reference could have been specified by using far pointers in C. To
specify far reference in Pascal, use the VARS keyword instead of VAR.
12.7.2 Calling a Pascal Function from C
The example below demonstrates a C main module calling Pascal function
fact. This function returns the factorial of an integer value.
/* C source file - calls Pascal function.
* Compile in medium or large model.
*/
int _pascal fact(int n);
/* PASCAL keyword causes C to use FORTRAN/Pascal
* calling and naming conventions.
* Integer parameter passed by value.
*/
main()
{
int x = 3;
int y = 4;
printf( "The factorial of x is %4d", fact( x ) );
printf( "The factorial of y is %4d", fact( y ) );
printf( "The factorial of x+y is %4d", fact( x + y ) );
}
{ Pascal source code - factorial function. }
MODULE Pfun;
FUNCTION Fact (n : INTEGER) : INTEGER;
{Integer parameters received by value, the Pascal default. }
BEGIN
Fact := 1;
WHILE n > 0 DO
BEGIN
Fact := Fact * n;
n := n - 1; {Parameter n modified.}
END;
END;
END.
In the example above, the C program adopts the Pascal naming convention and
calling convention. Both programs must agree on whether parameters are
passed by reference or by value. The _pascal keyword directs C to call fact
with the FORTRAN/Pascal naming convention (as FACT); _pascal also directs
C to call fact with the FORTRAN/Pascal calling convention.
The Pascal function fact should receive a parameter by value. Otherwise,
the Pascal function will corrupt the parameter's value in the calling
module. Passing by value is the default method for both C and Pascal.
12.8 C Calls to Assembly Language
In Microsoft C, Version 6.0, you can write assembly-language programs either
by using the in-line assembler or by creating a stand-alone module using the
Microsoft Macro Assembler (MASM). If you use the in-line assembler, you do
not need to take any special precautions other than those outlined in
Chapter 3, "Using the In-Line Assembler." This section explains the
techniques for interfacing your assembly-language routines with your C
program.
When deciding whether to use the in-line assembler or MASM, there are
several considerations. Here is a list of advantages MASM provides over the
in-line assembler:
■ MASM supports declaration of data in MASM format; in-line assembly
does not.
■ MASM has a more powerful macro capability than in-line assembly.
■ Modules written for MASM can be interfaced more easily with modules
written in more than one Microsoft high-level language.
■ MASM assembles large assembly-language programs more quickly than the
in-line assembler.
■ MASM supports assembly-language code written prior to the existence of
the in-line assembler.
■ MASM error messages and warnings are more complete than those of the
in-line assembler.
The in-line assembler is far more efficient for some assembly-language
programming tasks. Here are some of the benefits of the in-line assembler:
■ You can do spot optimizations by including short sections of
assemblylanguage code in your C programs with the in-line assembler.
■ Code written in in-line assembler does not necessarily incur the
overhead of a function call; code assembled using MASM always does.
■ You can include in-line assembly code in your C source files; code
written for MASM must be in a separate file.
12.8.1 Writing the Assembly-Language Procedure
You must write your assembly-language procedure so that it uses the same
call-ing conventions and naming conventions as your C program. If you follow
these conventions, you will be able to write recursive procedures
(procedures that call themselves), and you will be able to use the CodeView
debugger to locate errors in the code.
────────────────────────────────────────────────────────────────────────────
NOTE
This section discusses only the simplified segment directives provided with
the Microsoft Macro Assembler, version 5.0. If you are using a version prior
to 5.0, you have to specify complete SEGMENT directives.
────────────────────────────────────────────────────────────────────────────
The standard assembly-language interface method consists of these steps:
1. Setting up the procedure
2. Entering the procedure
3. Allocating local data (optional)
4. Preserving register values
5. Accessing parameters
6. Returning a value (optional)
7. Exiting the procedure
The next sections describe each of these steps in detail.
12.8.2 Setting Up the Procedure
The linker cannot combine the assembly-language procedure with the C program
unless you define compatible segments and declare the procedure properly.
Perform the following steps to set up the procedure:
1. Use the .MODEL directive at the beginning of the source file; this
directive automatically causes the appropriate kind of returns to be
generated (NEAR for tiny, small or compact models, FAR for medium,
large, or huge models).
If you are using a version of MASM prior to 5.0, declare the procedure
NEAR for small or compact model, FAR for medium, large, or huge
models.
2. Use the simplified segment directives .CODE and .DATA to declare the
code and data segments.
If you are using a version of MASM prior to 5.0, declare the segments
using the SEGMENT, GROUP, and ASSUME directives. These directives are
described in the Microsoft Macro Assembler Reference .
3. Use the PUBLIC directive to declare the procedure label public. This
declaration makes the procedure visible to other modules. Also declare
any data you want to make public as PUBLIC.
4. Use the EXTRN directive to declare any global data or procedures
accessed by the routine as external. The safest way to use EXTRN is to
place the directive outside any segment definition; however, place
near data inside the data segment.
5. Observe the C naming convention; precede all procedure names and
global data names with an underscore.
12.8.3 Entering the Procedure
When you enter the procedure, in most cases you will want to set up a "stack
frame." This allows you to access parameters passed on the stack and to
allocate local data on the stack. You do not need to set up the stack frame
if your procedure accepts no arguments and does not use the stack.
To set up the stack frame, issue the instructions:
push bp
mov bp,sp
This sequence establishes BP as the frame pointer. You cannot use SP for
this purpose because it is not an index or base register. Also, the value of
SP may change as more data are pushed onto the stack. However, the value of
the base register BP remains constant for the life of the procedure unless
your program changes it, so each parameter can be addressed as an offset
from BP.
The instruction sequence above preserves the value of BP, since it will be
needed in the calling procedure as soon as your assembly-language procedure
returns. It then transfers the value in SP to BP to establish a stack frame
on entry to the procedure.
12.8.4 Allocating Local Data
Your assembly-language procedure can use the same technique for allocating
temporary storage for local data that is used by high-level languages. To
set up local data space, decrease the contents of SP just after setting up
the stack frame. (To ensure correct execution, always increase or decrease
SP by an even number.) Decreasing SP reserves space on the stack for local
data. You must restore the space at the end of the procedure as follows:
push bp
mov bp,sp
sub sp,space
In the example above, space is the total size in bytes of the local data
you want to allocate. Local variables are then accessed as fixed negative
displacements from BP.
In the following example, the entry sequence establishes a stack frame and
allocates temporary local storage for two words (4 bytes) of data. Later in
the example, the program accesses the local storage, initializing both to 0.
push bp ; Save old stack frame.
mov bp,sp ; Set up new stack frame.
sub sp,4 ; Allocate 4 bytes of local storage.
.
.
.
mov WORD PTR [bp-2],0
mov WORD PTR [bp-4],0
Note that local variables are also called dynamic, stack, or automatic
variables.
12.8.5 Preserving Register Values
A procedure called from C should preserve the values of SI, DI, SS, and DS
(in addition to BP, which is already saved). You should push any register
value that your procedure modifies onto the stack after setting up the stack
frame and allocating local storage, but prior to entering the main body of
the procedure. Registers that your procedure does not alter need not be
preserved.
────────────────────────────────────────────────────────────────────────────
WARNING
Routines that your assembly-language procedure calls must not alter the SI,
DI, SS, DS, or BP registers. If they do, and you have not preserved the
registers, they can corrupt the calling program's register variables,
segment registers, and stack frame, causing program failure. If your
procedure modifies the direction flag using the STD or CLD instructions, you
must preserve the flags register.
────────────────────────────────────────────────────────────────────────────
The example below shows an entry sequence that sets up a stack frame,
allocates 4 bytes of local data space on the stack, then preserves the SI,
DI, and flags registers.
push bp ; Save caller's stack frame.
mov bp,sp ; Establish new stack frame.
sub sp,4 ; Allocate local data space.
push si ; Save SI and DI registers.
push di
pushf ; Save the flags register.
.
.
.
In the example above, you must exit the procedure with the following code:
popf ; Restore the flags register.
pop di ; Restore the old value in the DI
register.
pop si ; Restore the old value in the SI
register.
mov sp,bp ; Restore the stack pointer.
pop bp ; Restore the frame pointer.
ret ; Return to the calling routine.
If you do not issue the instructions above in the order shown, you will
place incorrect data in registers. Follow the rules below when restoring the
calling program's registers, stack pointer, and frame pointer:
■ Pop all registers that you preserve in the reverse order from which
they were pushed onto the stack. So, in the example above, SI and DI
are pushed, and DI and SI are popped.
■ Restore the stack pointer by transferring the value of BP into SP
before restoring the value of the frame pointer.
■ Always restore the frame pointer last.
12.8.6 Accessing Parameters
Once you have established the frame pointer, allocated local storage (if
required), and pushed any registers that need to be preserved, you can write
the main body of the procedure. Figure 12.4 shows how functions that observe
the C calling convention use the stack frame.
(This figure may be found in the printed book.)
The stack frame for the assembly-language procedure shown in Figure 12.4 is
established by the following:
1. The calling program pushes each of the parameters onto the stack,
after which SP points to the last parameter pushed.
2. The calling program issues a CALL instruction, which causes the return
address (the place in the calling program to which control will
ultimately return) to be placed on the stack. This address can be
either two bytes long (for near calls) or four bytes long (for far
calls). SP now points to this address.
3. The first instruction of the called procedure saves the old value of
BP, with the instruction push bp. SP now points to the saved copy of
BP.
4. BP is used to hold the current value of SP, with the instruction mov
bp,sp. BP therefore now points to the old value of BP (saved on the
stack).
5. While BP remains constant throughout the procedure, SP is often
decreased to provide room on the stack for local data or saved
registers.
In general, the displacement (from BP) for a parameter x is equal to the
size of return address plus 2 plus the total size of parameters between x
and BP.
To calculate the size of parameters between x and BP, you must start with
the rightmost parameter because C pushes parameters from right to left. For
example, consider a FAR procedure that has one argument of type int (two
bytes). The displacement of the parameter is
Argument's displacement = size of far return address + 2
= 4 + 2
= 6
The argument can thus be loaded into BP with the following instruction:
mov bx,[bp+6]
Once you determine the displacement of each parameter, you can use EQU
directives or structures to refer to the parameter with a single identifier
name in your assembly source code. For example, you can use a more readable
name to reference the parameter at BP+6 if you put the following statement
at the beginning of the assembly source file:
Arg1 EQU [bp+6]
You can then refer to the first parameter in your source as Arg1 in any
instruction. Use of this feature is optional.
For far (segmented) addresses, Microsoft C pushes the segment address before
pushing the offset address. When pushing arguments larger than two bytes,
high-order words are always pushed before low-order words, and parameters
larger than two bytes are stored on the stack in most-significant,
least-significant order.
This standard for pushing segment addresses before pushing offset addresses
facilitates the use of the assembly-language instructions LDS (load data
segment) and LES (load extra segment).
12.8.7 Returning a Value
Your assembly-language procedure can return a value to a C calling program.
All return values of four bytes or less are passed in registers. Far
pointers to return values larger than four bytes are returned in the DX and
AX registers. The DX register contains the segment address; the AX register
contains the offset relative to the segment contained in DX.
Table 12.3 shows the register conventions for returning simple data types to
a C program.
Table 12.3 Register Conventions for Simple Return Values
╓┌─────────────────────────────────┌─────────────────────────────────────────╖
Data Type Registers
────────────────────────────────────────────────────────────────────────────
char AL
Data Type Registers
────────────────────────────────────────────────────────────────────────────
int, short, _near * AX
long, _far * High-order portion (or segment address)
in DX;
low-order portion (or offset address) in
AX
────────────────────────────────────────────────────────────────────────────
Your procedures can return structures.
To return a structure from a procedure that uses the C calling convention,
you must copy the structure to a global variable, then return a pointer to
that variable in the AX register (DX:AX, if you compiled in compact, large,
or huge model).
Procedures that use the FORTRAN/Pascal calling convention return structures
similarly, with the following exceptions:
■ The calling program allocates space for the return value on the stack.
■ The calling program passes a pointer to the location where the return
value is to be placed in a hidden parameter.
■ Instead of copying your structure into a global data item, you copy it
into the location pointed to by the hidden parameter.
■ You must still return the pointer to that location in the AX register
(or DX:AX for far data models).
You can return floating-point values from your procedures.
Procedures that use the C calling convention and return type float or type
double must always copy their return values into the global variable fac. To
return floating-point values from procedures declared with the
FORTRAN/Pascal calling convention, you must return the result on the stack,
just as you would a structure.
To return a value of type long double, you must place the value on the
NDP(80x87) stack using the FLD instruction. The C run-time math routines
guarantee that the only value on the NDP stack is a return value; your
routines must observe the same rule.
12.8.8 Exiting the Procedure
Before you exit your assembly-language procedure, you must perform several
steps to restore the calling program's environment. Some of these steps are
dependent on actions you took in allocating space for local variables and
preserving registers.
You must follow these steps (if appropriate to your procedure) in the order
shown:
1. If you saved any of the registers SS, DS, SI, or DI, they must be
popped off the stack in the reverse order from which they were saved.
If you pop these registers in any other order, your program will
behave incorrectly.
2. If you allocated local data space at the beginning of the procedure,
you must restore SP with the instruction mov s p ,bp.
3. Restore BP with the instruction pop bp. This step is always
necessary.
4. Return to the calling program by issuing the ret instruction.
The following example shows the simplest possible entry and exit sequence.
In the entry sequence, no registers are saved and no local data space is
allocated.
push bp
mov bp,sp ; Set up the new stack frame.
.
.
.
pop bp ; Restore the caller's stack frame.
ret
The following example shows an entry and exit sequence for a procedure that
saves SI and DI and allocates local data space on the stack.
push bp
mov bp,sp ; Establish local stack frame.
sub sp,4 ; Allocate space for local data.
push si ; Preserve the SI and DI registers.
push di
.
.
.
pop di ; Pop saved registers.
pop si
mov sp,bp ; Free local data space.
pop bp ; Restore old stack frame.
ret
12.9 Handling Data in Mixed-Language Programming
This section contains detailed information about naming and calling
conventions in a mixed-language program. It also describes how various
languages represent strings, numerical data, arrays, and logical data.
12.9.1 Default Naming and Calling Conventions
Each language has its own default naming and calling conventions (Table
12.4).
Table 12.4 Default Naming and Calling Conventions
╓┌─────────┌──────────────────┌──────────────────┌───────────────────────────╖
Calling Naming Parameter
Language Convention Convention Passing
────────────────────────────────────────────────────────────────────────────
BASIC FORTRAN/Pascal Case insensitive Near reference
Calling Naming Parameter
Language Convention Convention Passing
────────────────────────────────────────────────────────────────────────────
C C Case sensitive Value (scalar variables),
reference (arrays and
pointers)
FORTRAN FORTRAN/Pascal Case insensitive Reference
Pascal FORTRAN/Pascal Case insensitive Value
────────────────────────────────────────────────────────────────────────────
BASIC Conventions
When you call BASIC routines from C, you must pass all arguments by near
reference (near pointer). You can modify the conventions observed by BASIC
routines that interface with C functions by using the DECLARE, BYVAL, SEG,
and CALLS keywords. For more information on these keywords, see the
Microsoft BASIC Language Reference or the Microsoft BASIC Programmer's
Guide.
FORTRAN Conventions
You can modify the conventions observed by FORTRAN routines that call C
functions by using the INTERFACE, VALUE, PASCAL, and C keywords. For more
information about the use of these keywords, see the Microsoft FORTRAN
Reference.
Pascal Conventions
You can modify the conventions observed by Pascal routines that interface
with C functions by using the VAR, CONST, ADR, VARS, CONSTS, ADRS, and C
keywords. For more information about the use of these keywords, see the
Microsoft Pascal Compiler User's Guide.
12.9.2 Numeric Data Representation
Table 12.5 shows how to declare numeric variables of similar type in
different languages.
Table 12.5 Equivalent Numeric Data Types
╓┌─────────────┌───────────────────┌──────────────────────┌──────────────────╖
BASIC C FORTRAN Pascal
────────────────────────────────────────────────────────────────────────────
x% short INTEGER*2 INTEGER2
INTEGER int --- INTEGER
(default)
--- unsigned short(1) --- WORD
--- unsigned --- ---
x& long INTEGER*4 INTEGER4
BASIC C FORTRAN Pascal
────────────────────────────────────────────────────────────────────────────
x& long INTEGER*4 INTEGER4
LONG --- INTEGER (default) ---
--- unsigned long(1) --- ---
x! float REAL*4 REAL4
x (default) --- REAL REAL (default)
SINGLE --- --- ---
x# double REAL*8 REAL8
DOUBLE --- DOUBLE ---
PRECISION
--- long double REAL*16 REAL16
BASIC C FORTRAN Pascal
────────────────────────────────────────────────────────────────────────────
--- unsigned char CHARACTER*1(2) CHAR
────────────────────────────────────────────────────────────────────────────
(1) Types unsigned short and unsigned long are not supported by BASIC or
FORTRAN. Type unsigned long is not supported by Pascal. A signed integral
type can be substituted, but the maximum range will be less.
(2) The FORTRAN type CHARACTER*1 is not the same as LOGICAL.
The FORTRAN types COMPLEX*8 and COMPLEX*16 are not implemented in C but can
be represented with structures.
The FORTRAN types LOGICAL*2 and LOGICAL*4 are not implemented in C.
LOGICAL*2 is stored as a one-byte Boolean indicator followed by an unused
byte; LOGICAL*4 is stored as a one-byte Boolean indicator followed by three
unused bytes.
12.9.3 Strings
Each language implements strings differently. This section describes the
ways that strings are implemented in Microsoft languages.
C String Format
C stores strings as arrays of bytes and uses a null character ( '\0' ) as
an end-of-string delimiter. For example, consider the following string:
char c_string[] = "C text string";
This string is represented in memory as follows:
(This figure may be found in the printed book.)
Because c_string is an array like any other, C passes it by reference in
function calls.
BASIC String Format
BASIC stores strings as four-byte descriptors pointing to the actual string
data. The format of the descriptor is as follows:
(This figure may be found in the printed book.)
The first field of the string descriptor contains an integer indicating the
length (in bytes) of the string. The second field contains the address of
the string in the default data segment.
Do not attempt to alter the length of BASIC strings, because they are
managed by BASIC string-space management routines. You cannot count on a
particular string remaining at a given offset during the execution of a
BASIC program because the BASIC string-space management routines allocate
strings to different areas of memory depending on program requirements.
The format of the string at DS:Address is a simple array of characters. The
string is exactly the length indicated in the descriptor.
To pass a BASIC string to C, append a null character.
Because C needs the null character to delimit the end of the string, you
should append chr$( 0 ) to your BASIC string before passing it to your C
function. For example,
A$ = "I am a BASIC string"
A$ = A$ + chr$( 0 )
CALL CFunc( SADD(A$) )
Note that the BASIC call is made by near reference using the SADD keyword.
Use a string descriptor to pass a C string to BASIC.
To pass a C string to BASIC, create a structure for the string descriptor.
For example,
char c_string[] = "C String Data";
struct tagBASICStringDes
{
char * sd_addr;
int sd_len;
} str_des;
str_des.sd_addr = c_string;
str_des.sd_len = strlen( c_string );
BASICFunction( &str_des );
FORTRAN String Format
FORTRAN stores strings as a series of bytes at a fixed location in memory.
There is no delimiter at the end of the string. Consider the string declared
as follows:
STR = 'FORTRAN STRING'
The string is stored in memory as follows:
(This figure may be found in the printed book.)
FORTRAN passes strings by reference, as it does all other data.
────────────────────────────────────────────────────────────────────────────
NOTE
FORTRAN's variable length strings cannot be used in mixed-language
programming because the temporary variable used to communicate string length
is not accessible to other languages.
────────────────────────────────────────────────────────────────────────────
To pass a C string to FORTRAN (or Pascal), pass the variable by reference as
you normally would. In your FORTRAN or Pascal routine, you must specify the
length of the string; strings that are passed as arguments from one language
to another must be of fixed length.
Pascal String Format
Pascal represents strings as fixed-length arrays of CHAR or as strings with
a length byte followed by the string data.
To pass a fixed-length string to C, append a null character.
To pass a fixed-length string to a C function, use the concatenation
operator (*) to append a null character. Then pass the string to the C
function by reference (by declaring the string as CONST, CONSTS, VAR, or
VARS). For example,
PROGRAM PasStr( input, output );
type
stype15 = string(15); { fixed-length }
var
str : stype15;
PROCEDURE PasStrToC( VAR s1 : stype15 ) [C]; EXTERN;
BEGIN
str := 'Pass this to C' * chr( 0 );
PasStrToC( str );
END.
A more flexible way to pass Pascal strings to C functions is to declare them
as type ADRMEM or ADSMEM, then pass the address of the string. For example,
PROCEDURE PasStrToC( s1adr : ADRMEM ) [C]; EXTERN;
Then you can call the C function with this code:
PasStrToC( ADR str );
Using this method, you can pass strings of different lengths to C functions.
────────────────────────────────────────────────────────────────────────────
NOTE
The Pascal type LSTRING is not compatible with C; you can pass a string
declared as LSTRING by first assigning it to another variable of type
STRING, then passing that variable.
────────────────────────────────────────────────────────────────────────────
Whenever you pass a variable of type STRING or type LSTRING by value, Pascal
pushes the whole string onto the stack and passes the length of the string
as another parameter. C cannot access strings passed in this manner.
Before passing a string from C to Pascal, make sure enough space is
allocated.
Passing a string from a C function to a Pascal function or procedure is
identical to passing a string from a C function to a FORTRAN routine. The
only provision you must make is to specify the length of the string to your
Pascal function.
12.9.4 Arrays
When you use an array in a program written in a single language, the method
for array handling is consistent. When you mix languages, you need to be
aware of the differences between array-handling techniques in various
languages.
Unlike most Microsoft languages, BASIC keeps an array descriptor, which is
similar to the BASIC string descriptor discussed in Section 12.9.3,
"Strings." This array descriptor is necessary because BASIC handles memory
allocation for arrays dynamically (at run time). Dynamic allocation requires
BASIC to shift arrays in memory.
To pass a BASIC array to a C function, use the VARPTR and VARSEG keywords.
The VARPTR and VARSEG keywords obtain the address of the first element of
the array and its segment, respectively. The example below shows how to call
a C function with a near reference and a far reference to an array:
DIM ARRAY%( 20 )
DECLARE CNearArray CDECL( BYVAL Addr AS INTEGER )
DECLARE CFarArray CDECL( BYVAL Addr AS INTEGER, BYVAL Seg AS INTEGER )
.
.
.
CALL CNearArray( VARPTR( ARRAY%(0) ) )
CALL CFarArray( VARPTR( ARRAY%(0) ), VARSEG( ARRAY%(0) ) )
The C functions receiving ARRAY can be declared as follows:
_cdecl CNearArray( int * array );
_cdecl CFarArray( int far * array );
The routine that receives the array must not make a call back to BASIC. If
it does, the location of the array data could change, and the address that
was passed to the routine would become meaningless.
If you only need to pass one member of the array from BASIC to your C
function, you can pass it by value as follows:
CALL CFunc( ARRAY%(8) )
12.9.5 Array Declaration and Indexing
Each language varies in the way that arrays are declared and indexed. Array
indexing is a source-level consideration and involves no transformation of
data. There are two differences in the way elements are indexed by each
language:
1. The value of the lower array bound is different among Microsoft
languages.
By default, FORTRAN indexes the first element of an array as 1. BASIC
and C index it as 0. Pascal lets you begin indexing at any integer
value. Recent versions of BASIC and FORTRAN also give you the option
of specifying lower bounds at any integer value.
2. Some languages vary subscripts in row-major order; others vary
subscripts in column-major order.
This issue only affects arrays with more than one dimension. With
row-major order (used by C and Pascal), the rightmost dimension
changes first. With column-major order (used by FORTRAN, and BASIC by
default), the leftmost dimension changes first. Thus, in C, the first
four elements of an array declared as X[3][3] are
X[0][0] X[0][1] X[0][2] X[1][0]
In FORTRAN, the four elements are
X(1,1) X(2,1) X(3,1) X(1,2)
The C and FORTRAN arrays shown above illustrate the difference between
row-major and column-major order as well as the difference in the
assumed lower bound between C and FORTRAN. Table 12.6 shows
equivalences for array declarations in each language. In this table, r
is the number of elements of the row dimension (which changes most
slowly), and c is the number of elements of the column dimension
(which changes most quickly).
Table 12.6 Equivalent Array Declarations
╓┌─────────┌────────────────────────────────┌────────────────────────────────╖
Language Array Declaration Notes
────────────────────────────────────────────────────────────────────────────
BASIC DIM x(r-1, c-1) With default lower bounds of 0
Language Array Declaration Notes
────────────────────────────────────────────────────────────────────────────
C type x[r][c] When passed by reference
struct { type x[r][c]; } x When passed by value
FORTRAN type x(c, r) With default lower bounds of 1
Pascal x : ARRAY [a..a+r-1, b..b+c-1]
OF type
────────────────────────────────────────────────────────────────────────────
The order of indexing extends to any number of dimensions you declare. For
example, the C declaration
int arr1[2][10][15][20];
is equivalent to the FORTRAN declaration
INTEGER*2 ARR1( 20, 15, 10, 2 )
The constants used in a C array declaration represent dimensions, not upper
bounds as they do in other languages. Therefore, the last element in the C
array declared as int arr[5][5] is arr[4][4], not arr[5][5].
12.9.6 Structures, Records, and User-Defined Types
The C struct type, the BASIC user-defined type, the FORTRAN record (defined
with the STRUCTURE keyword), and the Pascal record type are equivalent.
Therefore, these data types can be passed between C, FORTRAN, Pascal, and
BASIC.
These types can be affected by the storage method. By default, C, FORTRAN,
and Pascal use word alignment for types shorter than one word (type char and
unsigned char). This storage method specifies that occasional bytes can be
inserted as padding so that word and double-word objects start on an even
boundary. (In addition, all nested structures and records start on a word
boundary.)
If you are passing a structure or record across a mixed-language interface,
your calling routine and called routine must agree on the storage method and
parameter-passing convention. Otherwise, data will not be interpreted
correctly.
Because Pascal, FORTRAN, and C use the same storage method for structures
and records, you can interchange data between routines without taking any
special precautions unless you modify the storage method. Make sure the
storage methods agree before interchanging data between C, FORTRAN, and
Pascal.
BASIC packs user-defined types, so your C function must also pack structures
(using the /Zp command-line option or the pack pragma) to agree.
You can pass structures as parameters by value or by reference. Both the
calling program and the called program must agree on the parameter-passing
convention. See Section 12.2.3, "Parameter-Passing Requirement," for more
information about the language you are using.
12.9.7 External Data
External data refers to data that is both static and public; that is, the
data is stored in a set place in memory as opposed to being allocated on the
stack, and the data is visible to other modules.
External data can be defined in C, Pascal, and assembly language. Note that
a data definition is distinct from an external declaration. A data
definition causes a compiler to create a data object; an external
declaration informs a compiler that the object is to be found in another
module. FORTRAN can only define external data in COMMON blocks. (See Section
12.9.9, "Common Blocks," for more information about sharing external data
with FORTRAN programs.)
There are three requirements for programs that share external data between
languages:
1. One of the modules must define the data.
You can define a static data object in a C module by defining a data
object outside all functions. (If you use the static keyword in C,
however, the data object will not be made public.)
2. The other modules that will access the data must declare the data as
external.
In C, you can declare data as external by using an extern declaration,
similar to the extern declaration for functions. In FORTRAN and
Pascal, you can declare data as external by adding the EXTERN
attribute to the data declaration.
3. Resolve naming-convention differences.
In C, you can adopt the FORTRAN/Pascal naming convention by applying
_fortran or _pascal to the data declaration. In FORTRAN and Pascal,
you can adopt the C naming convention by applying the C attribute to
the data declaration.
12.9.8 Pointers and Address Variables
Rather than passing data directly, you may want to pass the address of a
piece of data. Passing the address amounts to passing the data by reference.
In some cases, such as in BASIC arrays, there is no other way to pass a data
item as a parameter.
C programs always pass array variables by address. All other types are
passed by value unless you use the address-of (&) operator to obtain the
address.
The Pascal ADR and ADS types are equivalent to near and far pointers,
respectively, in C. You can pass ADR and ADS variables as ADRMEM or ADSMEM.
BASIC and FORTRAN do not have formal address types. However, they do provide
ways for storing and passing addresses.
BASIC programs can access a variable's segment address with the VARSEG
function and its offset address with the VARPTR function. The values
returned by these intrinsic functions should then be passed or stored as
ordinary integer variables. If you pass them to another language, pass by
value. Otherwise you will be attempting to pass the address of the address,
rather than the address itself.
To pass a near address, pass only the offset; if you need to pass a far
address, you may have to pass the segment and the offset separately. Pass
the segment address first, unless CDECL is in effect.
FORTRAN programs can determine near and far addresses with the LOC and
LOCFAR functions. Store the result of the LOC function as INTEGER*2 and the
result of the LOCFAR function as INTEGER*4.
As with BASIC, if you pass the result of LOC or LOCFAR to another language,
be sure to pass by value.
12.9.9 Common Blocks
You can pass individual members of a FORTRAN or BASIC common block in an
argument list, just as you can any data item. However, you can also give a
different language module access to the entire common block at once.
C modules can reference the items of a common block by first declaring a
structure with fields that correspond to the common-block variables. Having
defined a structure with the appropriate fields, the C module must then
connect with the common block itself. The next two sections present methods
for gaining access to common blocks.
Passing the Address of a Common Block
To pass the address of a common block, simply pass the address of the first
variable in the block. (In other words, pass the first variable by
reference.) The receiving C module should expect to receive a structure by
reference.
In the example below, the C function initcb receives the address of the
variable N, which it considers to be a pointer to a structure with three
fields:
C FORTRAN SOURCE CODE
C
COMMON /CBLOCK/N, X, Y
INTEGER*2 N
REAL*8 X, Y
.
.
.
CALL INITCB( N )
/* C source code */
/* Explicitly set structure packing to word-alignment */
#pragma pack( 2 );
struct block_type
{
int n;
double x;
double y;
};
initcb( struct block_type * block_hed )
{
block_hed-n = 1;
block_hed-x = 10.0;
block_hed-y = 20.0;
}
Accessing Common Blocks Directly
You can access FORTRAN common blocks directly by defining a structure with
the appropriate fields and then using the methods described in Section
12.9.7, "External Data." Here is an example of accessing common blocks
directly:
struct block_type
{
int n;
double x;
double y;
};
extern struct block_type fortran cblock;
You cannot access common blocks directly using BASIC common blocks.
Note that the technique of accessing common blocks directly works with
FORTRAN common blocks, but not with BASIC common blocks. If your C module
must work with both FORTRAN and BASIC common blocks, pass the address of the
common block as a parameter to the function.
12.9.10 Using a Varying Number of Parameters
Some C functions (for example printf) accept a variable number of
parameters. To call such a function from another language, you need to
suppress the type-checking that normally forces a call to be made with a
fixed number of parameters. In BASIC, you can remove this type-checking by
omitting a parameter list from the DECLARE statement. In FORTRAN or Pascal,
you can call routines with a variable number of parameters by including the
VARYING attribute in your interface to the routine, along with the C
attribute. You must use the C attribute because a variable number of
parameters is feasible only with the C calling convention.
Chapter 13 Writing Portable Programs
────────────────────────────────────────────────────────────────────────────
Because C compilers exist on a variety of computers, some C applications
developed for one computer system can be ported to other systems. However,
some aspects of language behavior depend on how a particular C compiler is
implemented and how a specific computer operates. Therefore, when designing
a program to be ported to another system, it is important that you examine
programming assumptions.
This chapter describes programming assumptions that can affect writing
portable programs.
The American National Standards Institute Standard for the C Language (the
ANSI Standard) details every instance where language behavior is defined by
the implementation. Appendix C summarizes implementation-defined behavior
for Microsoft C.
13.1 Assumptions about Hardware
To make C programs portable, you must examine two aspects of your code:
hardware assumptions and compiler dependency. This section deals with
hardware assumptions. Section 13.2, "Assumptions about the Compiler," deals
with compiler dependency.
13.1.1 Size of Basic Types
In C, the size of basic types (char, signed int, unsigned int, float,
double, and long double) is implementation-defined, so relying on a
particular data type to be a given size reduces the portability of a
program.
Don't make assumptions about the size of data types.
Because the size of basic types is left to the implementation, do not make
assumptions about the size or alignment of data types within aggregate
types. Use only the sizeof operator to determine the size or amount of
storage required for a variable or a type.
Following are some rules governing the size of data types.
Type char
Type char is the smallest of the basic types, but it must be large enough to
hold any of the characters in the implementation's basic character set.
Normally, variables of type char are one byte.
Type int and Type short int
Type int and type short int often correspond to the register size of the
target machine. Both int and short are greater than or equal to the size of
type char but less than or equal to the size of type long.
If you assume that type int is a certain size, your code may not be portable
because
■ An int can be defined as a 16-bit (two-byte) or a 32-bit quantity.
■ An int is not always large enough to hold array indexes. For large
arrays, you must use unsigned int; for extremely large arrays, use
long. To be certain your code is portable, define your array indexes
as type size_t. You may not know, before porting your code, the
maximum value to expect an array index of type int to hold. The file
LIMITS.H contains manifest constants, listed below, for the maximum
and minimum values of each basic integral type.
Constant Value
────────────────────────────────────────────────────────────────────────────
CHAR_BIT Number of bits in a variable of type
char
CHAR_MIN Minimum value a variable of type char
can hold
CHAR_MAX Maximum value a variable of type char
can hold
SCHAR_MIN Minimum value a variable of type signed
char
can hold
SCHAR_MAX Maximum value a variable of type signed
char
can hold
UCHAR_MAX Maximum value a variable of type
unsigned char can hold
SHRT_MIN Minimum value a variable of type short
can hold
SHRT_MAX Maximum value a variable of type short
can hold
USHRT_MAX Maximum value a variable of type
unsigned short can hold
INT_MIN Minimum value a variable of type int can
hold
INT_MAX Maximum value a variable of type int can
hold
UINT_MAX Maximum value a variable of type
unsigned int
can hold
LONG_MIN Minimum value a variable of type long
can hold
LONG_MAX Maximum value a variable of type long
can hold
ULONG_MAX Maximum value a variable of type
unsigned long can hold
Type float, Type double, and Type long double
Type float is the smallest of the basic floating-point types. Type double is
usually larger than type float, and type long double is usually the largest
of the floating-point types. You can make only these portability assumptions
about floating-point types:
■ Any value that can be represented as type float can be represented as
type double (type float is a subset of type double).
■ Any value that can be represented as type double can be represented as
type long double (type double is a subset of type long double).
The file FLOAT.H contains manifest constants, listed below, for the maximum
and minimum values of each basic floating-point type.
Constant Value
────────────────────────────────────────────────────────────────────────────
DBL_DIG Number of decimal digits of precision a
variable of type double can hold
DBL_MAX Maximum value a variable of type double
can hold
DBL_MAX_10_EXP Maximum value (base 10) the exponent of
a variable of type double can hold
DBL_MAX_EXP Maximum value (base 2) the exponent of a
variable of type double can hold
DBL_MIN Minimum positive value a variable of
type double can hold
DBL_MIN_10_EXP Minimum value (base 10) the exponent of
a variable of type double can hold
DBL_MIN_EXP Minimum value (base 2) the exponent of a
variable of type double can hold
FLT_DIG Number of decimal digits of precision a
variable of type float can hold
FLT_MAX Maximum value a variable of type float
can hold
FLT_MAX_10_EXP Maximum value (base 10) the exponent of
a variable of type float can hold
FLT_MAX_EXP Maximum value (base 2) the exponent of a
variable of type float can hold
FLT_MIN Minimum positive value a variable of
type float can hold
FLT_MIN_10_EXP Minimum value (base 10) the exponent of
a variable of type float can hold
FLT_MIN_EXP Minimum value (base 2) the exponent of a
variable of type float can hold
LDBL_DIG Number of decimal digits of precision a
variable of type long double can hold
LDBL_MAX Maximum value a variable of type long
double can hold
LDBL_MAX_10_EXP Maximum value (base 10) the exponent of
a variable of type long double can hold
LDBL_MAX_EXP Maximum value (base 2) the exponent of a
variable of type long double can hold
LDBL_MIN Minimum positive value a variable of
type long double can hold
LDBL_MIN_10_EXP Minimum value (base 10) the exponent of
a variable of type long double can hold
LDBL_MIN_EXP Minimum value (base 2) the exponent of a
variable of type long double can hold
Microsoft C Type Sizes
Table 13.1 summarizes the size of the basic types in Microsoft C.
Table 13.1 Size of Basic Types in Microsoft C
╓┌─────────────────────────────────────┌─────────────────────────────────────╖
Number
Type of Bytes
────────────────────────────────────────────────────────────────────────────
char, unsigned char 1
int, short, unsigned int, 2
unsigned short
near pointer 2
Number
Type of Bytes
────────────────────────────────────────────────────────────────────────────
long, unsigned long 4
far pointer 4
float 4
double 8
long double 10
────────────────────────────────────────────────────────────────────────────
13.1.2 Storage Order and Alignment
The C language does not define any specific layout for the storage of data
items relative to one another. The layout for storage of structure elements,
or unions within a structure or union, is defined by the implementation.
Some processors require that data longer than one byte be word-aligned
(aligned to an even-byte address). Other processors, such as the 80x86
family, do not have such a restriction.
Structure Order and Alignment
The example below illustrates how alignment can affect your program. In the
example, a structure is cast to type long because the programmer knew the
order in which a particular implementation stored data.
/* Nonportable code */
struct time
{
char hour; /* 0 < hour < 24 -- fits in a char */
char minute; /* 0 < minute < 60 -- fits in a char */
char second; /* 0 < second < 60 -- fits in a char */
};
.
.
.
struct time now, alarm_time;
.
.
.
if ( (long)now >= (long)alarm_time )
{
/* sound an alarm */
}
The preceding code makes these nonportable assumptions:
■ The data for hour will be stored in a higher order position than
minute or second. Because C does not guarantee storage order or
alignment of structures or unions, the code may not be portable to
other machines.
■ Three variables of type char will be shorter than or the same length
as a variable of type long. Thus, the code is not portable according
to the rules governing the size of basic types, as described in
Section 13.1.1.
If either of these assumptions proves false, the comparison (if statement)
is invalid.
You can write code that makes no assumptions about storage order.
To make the program in the preceding example portable, you can break the
comparison between the two long integers into a component-by-component
comparison. This technique is illustrated in the following example:
/* Portable code */
struct time
{
char hour; /* 0 < hour < 24 -- fits in a char */
char minute; /* 0 < minute < 60 -- fits in a char */
char second; /* 0 < second < 60 -- fits in a char */
};
.
.
.
struct time now, alarm_time;
.
.
.
if ( time_cmp( now, alarm_time ) >= 0 )
{
/* sound an alarm */
}
.
.
.
int time_cmp( struct time t1, struct time t2 )
{
if( t1.hour != t2.hour )
return( t2.hour - t1.hour );
if( t1.minute != t2.minute )
return( t2.minute - t1.minute );
return( t2.second - t1.second );
}
Union Order and Alignment
Programmers use unions most often for two purposes: to store data whose
exact type is not known until run time or to access the same data in
different ways.
Unions falling into the second category are usually not portable. For
example, the union below is not portable:
union tag_u
{
char bytes_in_long[4];
long a_long;
};
The intent of the union above is to access the individual bytes of a
variable of type long. However, the union may not work as intended when
ported to other computers because
■ It relies on a constant size for type long.
■ It may assume byte ordering within a variable of type long. (Byte
ordering is described in detail in Section 13.1.3, "Byte Order in a
Word.")
The first problem can be addressed by coding the union as follows:
union tag_u
{
char bytes_in_long[sizeof( long ) / sizeof( char )];
long a_long;
};
Note the use of the sizeof operator to determine the size of a data type.
13.1.3 Byte Order in a Word
The order of bytes within a word (int or short) or a double word (long) can
vary among machines. Code that assumes an internal order is not portable, as
shown by this example:
/*
* Nonportable structure to access an
* int in bytes.
*/
struct tag_int_bytes
{
char lobyte;
char hibyte;
};
A more portable way to access the individual bytes in a word is to define
two macros that rely on the constant CHAR_BIT, defined in LIMITS.H:
#define LOBYTE(a) (char)((a) & 0xff)
#define HIBYTE(a) (char)((unsigned)(a) >> CHAR_BIT)
The LOBYTE macro is still not completely portable. It assumes that a char is
eight bits long, and it uses the constant 0xff to mask the high-order
eight bits. Because portable programs cannot rely on a given number of bits
in a byte, consider the revision below:
#define LOBYTE(a) (char)((a) & ((unsigned)~0>>CHAR_BIT))
#define HIBYTE(a) (char)((unsigned)(a) >> CHAR_BIT)
The new LOBYTE macro performs a bitwise complement on 0; that is, all zero
bits are turned into ones. It then takes that unsigned quantity and shifts
it right far enough to create a mask of the correct length for the
implementation.
The following code assumes that the order of bytes in a word will be
leastsignificant first:
int c;
.
.
.
fread( &c, sizeof( char ), 1, fp );
The code attempts to read one byte as an int, without converting it from a
char. However, the code will fail in any implementation where the low-order
byte is not the first byte of an int. The following solution is more
portable. In the example below, the data is read into an intermediate
variable of type char before being assigned to the integer variable.
int c;
char ch;
.
.
.
fread( &ch, sizeof( char ), 1, fp );
c = ch;
The example below shows how to use the C run-time function fgetc to return
the value. The fgetc function returns type char, but the value is promoted
to type int when it is assigned to a variable of type int.
int c;
.
.
.
c = fgetc( fp );
Microsoft C Specific
Microsoft C normally aligns data types longer than one byte to an even-byte
address for improved performance. See the /Zp compiler option and the pack
pragma in the Microsoft C Reference and in on-line help for information
about controlling structure packing in Microsoft C.
13.1.4 Reading and Writing Structures
Many C programs read data from disk into structures and write data to disk
from structures. The functions that perform disk I/O in C require you to
specify the number of bytes to be transferred. You should always use the
sizeof operator to obtain the size of the data to be read or written because
differing data type sizes or alignment schemes may alter the size of a given
structure. For example,
fread( &my_struct, sizeof(my_struct), 1, fp );
Microsoft C Specific
When performing disk input and output in Microsoft C, structures may be
different sizes depending on the structure-packing option you have selected
(see the /Zp compiler option and the pack pragma in the Microsoft C
Reference).
13.1.5 Bit Fields in Structures
The Microsoft C compiler implements bit fields. However, many C compilers
do not.
Bit fields allow you to access the individual bits within a data item. While
the practice of accessing the bits in a data item is inherently nonportable,
you can
improve your chances of porting a program that uses bit fields if you make
no assumptions about order of assignment, or size and alignment of bit
fields.
Order of Assignment
The order of assignment of bit fields in memory is left to the
implementation, so you cannot rely on a particular entry in a bit field
structure to be in a higher order position than another. (This problem is
similar to the portability constraint imposed by alignment of basic data
types in structures. The C language does not define any specific layout for
the storage of data items relative to one another.) See Section 13.1.2,
"Storage Order and Alignment" for more information.
Size and Alignment of Bit Fields
The Microsoft C compiler supports bit fields up to the size of the type
long. Each individual member of the bit field structure can be up to the
size of the declared type. Some compilers do not support bit field-structure
elements that are longer than type int.
The example below defines a bit field, short_bitfield, that is shorter than
type int:
struct short_bitfield
{
unsigned usr_bkup : 1; /* 0 <= usr_bkup < 1 */
unsigned usr_sec : 4; /* 9 <= usr_sec < 16 */
};
The example below defines a bit field, long_bitfield, that has elements
longer than type int:
struct long_bitfield
{
unsigned long disk_pos : 22; /* 0 <= disk_pos < 4,194,304 */
unsigned long rec_no : 10; /* 0 <= rec_no < 1,024 */
};
The bit field short_bitfield is likely to be supported by more
implementations than long_bitfield.
Microsoft C Specific
The example below introduces another portability issue: alignment of data
defined in bit fields. The Microsoft C compiler does not allow an element in
a structure to extend across two words. The first two elements, day and
month, take up nine bits. The third, year, would extend across a word
boundary, so it must begin on the next word boundary.
struct long_bitfield
{
unsigned int day : 5; /* 0 <= day < 32 */
unsigned int month : 4; /* 0 <= month < 16 */
unsigned int year : 11; /* 0 <= year < 2048 */
};
Figure 13.1 illustrates the example above.
(This figure may be found in the printed book.)
Other compilers may not use the same storage techniques.
13.1.6 Processor Arithmetic Mode
Two types of arithmetic are common on digital computers: one's-complement
arithmetic and two's-complement arithmetic. Some programs assume that all
target computers perform two's-complement arithmetic. If you take advantage
of the fact that a given operation causes a particular bit pattern to be set
on either a one's-complement or two's-complement computer, your program will
not be portable. For example, two's-complement machines represent the
eight-bit integer value -1 as a binary 11111111. A one's-complement machine
represents the same decimal value (-1) as 11111110. Some programmers assume
that -1 will fill a byte or a word with ones, and use it to construct a mask
template that they later shift. This will not work correctly on
one's-complement machines, but the error will not surface until the
least-significant bit is used.
In two's-complement arithmetic, there is only one value that represents
zero. In one's-complement arithmetic, there is a value for zero and a value
for negative zero. Use the C relational operators to handle this anomaly
correctly; if you write code that deliberately circumvents the C relational
operators, tests for zero or NULL may not operate correctly.
Microsoft C Specific
Microsoft C produces code only for the Intel 80x86 processors, which all
perform two's-complement arithmetic.
13.1.7 Pointers
One of the most powerful but potentially dangerous features of the C
language is its use of indirect addressing through pointers. Bugs introduced
by misusing pointers can be difficult to detect and isolate because the
error often corrupts memory unpredictably.
Casting Pointers
Be sure you do not make nonportable assumptions when casting pointers to
different types.
/* Nonportable coercion */
char c[4];
long *lp;
lp = (long *)c;
*lp = 0x12345678L;
This code is nonportable because using a cast to change an array of char to
a pointer of type long assumes a particular byte-ordering scheme. This is
discussed in greater detail in Section 13.1.3, "Byte Order in a Word."
Pointer Size
A pointer can be assigned (or cast) to any integer type large enough to hold
it, but the size of the integer type depends on the machine and the
implementation. (In fact, it can even depend on the memory model.)
Therefore, you cannot assume:
sizeof( char * ) == sizeof( int )
To determine the size of any unmodified data pointer, use
sizeof( void * )
the size of a generic data pointer.
Pointer Subtraction
Code that assumes that pointer subtraction yields an int value is
nonportable. Pointer subtraction yields a result of type ptrdiff_t (defined
in STDDEF.H). Portable code must always use variables of type ptrdiff_t for
storing the result of pointer subtraction.
The Null Pointer
In most implementations, NULL is defined as 0. In Microsoft C, it is defined
as ((void *)0). Because code pointers and data pointers are often different
sizes, using 0 for the null pointer for both can lead to nonportability. The
difference in size between code pointers and data pointers will cause
problems for functions that expect pointer arguments longer than an int. To
avoid these problems, use the null pointer, as defined in the include file
STDDEF.H; use prototypes; or explicitly cast NULL to the correct data type.
Here is a portable way to use the null pointer:
/* Portable use of the NULL pointer */
main()
{
func1( (char *)NULL );
func2( (void *(*)())NULL );
}
void func1( char * c )
{
}
void func2( void *(* func)() )
{
}
The invocations of func1 and func2 explicitly cast NULL to the correct
size. In the case of func1, NULL is cast to type char *; in the case of
func2, it is cast to a pointer to a function that returns type void.
Microsoft C Specific
Subtraction of pointers to huge arrays that have more than 32,767 elements
may yield a long result. The _huge keyword is implementation-defined by
Microsoft C and is not portable. Here is how to subtract pointers to huge
arrays:
char _huge *a;
char _huge *b;
long d;
.
.
.
d = (long)( a - b );
In Microsoft C, the memory model selected and the special keywords _near,
_far, and _huge can change the size of a pointer. The Microsoft memory
models and extended keywords are nonportable, but you should be aware of
their effects.
Sizes of generic pointers and default pointer sizes are shown in Tables 13.2
and 13.3, respectively.
Table 13.2 Size of Generic Pointers
╓┌──────────────┌──────────────────────┌─────────────────────────────────────╖
Declaration Name Size
────────────────────────────────────────────────────────────────────────────
void _near * Generic near pointer 16 bits
void _far * Generic far pointer 32 bits
void _huge * Generic huge pointer 32 bits
Declaration Name Size
────────────────────────────────────────────────────────────────────────────
void _huge * Generic huge pointer 32 bits
────────────────────────────────────────────────────────────────────────────
Table 13.3 Default Pointer Sizes
╓┌─────────────┌──────────────────┌──────────────────────────────────────────╖
Memory Model Code Pointer Size Data Pointer Size
────────────────────────────────────────────────────────────────────────────
Tiny 16 bits 16 bits
Small 16 bits 16 bits
Medium 32 bits 16 bits
Compact 16 bits 32 bits
Large 32 bits 32 bits
Huge 32 bits 32 bits
────────────────────────────────────────────────────────────────────────────
13.1.8 Address Space
The amount of available memory and the address space on systems varies,
depending on many factors outside your control. A program designed with
portability in mind should handle insufficient-memory situations. To ensure
that your program handles these situations, you should always check the
error return from any of the dynamic memory allocation routines, such as
malloc, calloc, strdup, and realloc.
These situations occur not only because of a lack of installed memory but
also because too many other applications are using memory. For example,
■ Installed resident software can cause your program to fail. In DOS,
these programs are usually device drivers or
terminate-and-stay-resident (TSR) utilities.
■ An event or combination of events in a multitasking operating system
such as OS/2 or XENIX can cause your program to fail. These failures
are complex and difficult to predict. Here is an example: the user has
installed a daemon to "pop up" every so often and check the system
status. The user is running your application along with enough other
large applications to cause a critical shortage of memory. When the
daemon pops up, your program may fail on a memory allocation request.
■ An application running under Windows can use an extraordinary amount
of the global heap and not return it to the free pool. This type of
behavior will cause Windows to deny a GlobalAlloc request.
13.1.9 Character Set
The C language does not define the character set used in an implementation.
This means that any programs that assume the character set to be ASCII are
nonportable.
The only restrictions on the character set are these:
■ No character in the implementation's character set may be larger than
the size of type char.
■ Each character in the set must be represented as a positive value by
type char, whether it is treated as signed or unsigned. So, in the
case of the ASCII character set and an eight-bit char, the maximum
value is 127 (128 is a negative number when stored in a char
variable).
Character Classification
The standard C run-time support contains a complete set of
characterclassification macros and functions. These functions are defined in
the CTYPE.H file and are guaranteed to be portable:
isalnum isdigit isprint isupper
isalpha isgraph ispunct isxdigit
iscntrl islower isspace
The following code fragment is not portable to implementations that do not
use the ASCII character set:
/* Nonportable */
if( c >= 'A' && c <= 'Z' )
/* uppercase alphabetic */
Instead, consider using this:
/* Portable */
if( isalpha(c) && isupper(c) )
/* uppercase alphabetic */
The first example above is nonportable, because it assumes that uppercase A
is represented by a smaller value than uppercase Z, and that no lowercase
characters fall between the values of A and Z. The second example is
portable, because it uses the character classification functions to perform
the tests.
In a portable program, you should not perform any comparison on variables of
type char except strict equality (==). You cannot assume the character set
follows an increasing sequence─that may not be true on a different machine.
Case Translation
Translation of characters from upper- to lowercase or from lowerto uppercase
is called "case translation." The following example shows a coding technique
for case translation not portable to implementations using a non-ASCII
character set.
#define make_upper(c) ((c)&0xcf)
#define make_lower(c) ((c)|0x20)
This code takes advantage of the fact that you can map uppercase to
lowercase simply by changing the state of bit 6. It is extremely efficient
but nonportable. To write portable code, use the case-translation macros
toupper and tolower (defined in CTYPE.H).
13.2 Assumptions about the Compiler
Different compilers translate C source code into object code in different
ways. The ANSI draft standard for the C programming language defines how
many of these translations must be done; others are implementation-defined.
This section describes assumptions about how the compiler translates your C
code, which can make your programs nonportable. For a complete description
of how Microsoft C handles implementation-defined operations, see Appendix
C, "Implementation-Defined Behavior."
13.2.1 Sign Extension
"Sign extension" is the propagation of the sign bit to fill unoccupied space
when promoting to a more-significant type or when performing bitwise
right-shift operations.
Promotion from Shorter Types
Integral promotions from shorter types occur when you make an assignment,
perform arithmetic, perform a comparison, or perform an explicit cast.
The behavior of integral promotion is well defined, except for type char.
The implementation defines whether type char is treated as signed or
unsigned. The code fragment below is an example of promotion as a result of
assignment:
char c1 = -3;
int i1;
i1 = c1;
In this example, the expected result of the assignment statement is that i1
will be set to -3. If the implementation defines type char as unsigned,
however, sign extension will not occur, and i1 will be 253 (on a
two's-complement machine).
Promotion can also occur as a result of a comparison of different types:
char c;
if( c == 0x80 )
.
.
.
This comparison will never evaluate as true on an implementation that
signextends char types but treats hexadecimal constants as unsigned. Use a
character constant of the form '\x80', or explicitly cast the constant to
type char to perform the comparison correctly.
The following comparison, which is an example of promotion as a result of a
cast, is also nonportable:
char c;
unsigned int u;
if( u == (unsigned)c )
There are two problems with this code:
■ The char type may be treated as signed or unsigned, depending on the
implementation.
■ If the char type is treated as signed, it can be converted to unsigned
in two different ways: the char value may first be sign-extended to
int, then converted to unsigned; or the char may be converted to
unsigned char, then sign-extended to int length.
It is always safe to compare a signed int with a char constant because C
requires all character constants to be positive.
Variables of type char are promoted to type int when passed as arguments to
a function. This will cause sign extension on some machines. Consider the
following code:
char c = 128;
printf( "%d\n", c );
Microsoft C Specific
Microsoft C allows you to treat type char as signed or unsigned. By default,
a char is considered signed, but if you change the default char type using
the /J compiler option, you can treat it as unsigned.
Bitwise Right-Shift Operations
Positive or unsigned integral types (char, short, int, and long) yield
positive or zero values after a right bitwise shift (>>) operation. For
example,
(char)120 >> 4
yields 7,
(unsigned char)240 >> 8
yields 0,
(int)500 >> 8
yields 1, and
(unsigned int)65535 >> 4
yields 4,095.
Negative-signed integral types yield implementation-defined values after a
bitwise right-shift operation. This means that you must know whether you
want to do a signed or unsigned shift, then code accordingly.
If you don't know how the implementation performs, you may get unexpected
results. For example, (signed char)0x80 >> 3 yields 0xf0 if the
imple-mentation performs sign extension on right bitwise shifts. If the
implementation does not perform the sign extension, the result is 0x10.
You can use right shifts to speed up division when the divisor can be
represented by powers of 2 and the dividend is positive. To maintain
portability, you should use the division operator.
To perform an unsigned shift, explicitly cast the data to an unsigned type.
To perform a shift that extends the sign bit, use the division operator as
follows: divide by 2n, where n is the number of bits you want to shift.
13.2.2 Length and Case of Identifiers
Some implementations do not support long identifiers. Some allow only 6
characters, while others allow as many as 32. They may report each
identifier that exceeds the maximum length or truncate identifiers to a
given length. Truncation causes serious problems, especially if you have a
number of similarly named variables within the scope of a block of code,
such as the following:
double acct_receivable_30_days;
double acct_receivable_60_days;
double acct_receivable_90_days;
double current_interest_rate;
acct_receivable_30_days *= current_interest_rate;
If your target system retains only six significant characters, you will have
to rename all your acct_receivable variables.
Case sensitivity also affects portability. C is usually a case-sensitive
language. That is, CalculateInterest is not considered the same identifier
as calculateinterest. Some systems are not case sensitive, however, so to
write portable code, differentiate your identifiers by something other than
case.
These problems with identifiers can occur in two locations: the compiler and
the linker or loader. Even if the compiler can handle long and
case-differentiated identifiers, if the linker or loader cannot, you can get
duplicate definitions or other unexpected errors.
Microsoft C Specific
The Microsoft C compiler issues the /NOIGNORECASE command to the Microsoft
Segmented-Executable Linker (LINK), specifically instructing it to consider
the case of identifiers.
13.2.3 Register Variables
The number and type of register variables in a function depend on the
implementation. You can declare more variables as register than the number
of physical registers the implementation uses. In such a case, the compiler
treats the excess register variables as automatic.
Since the types that qualify for register class differ among
implementations, invalid register declarations are treated as automatic.
If you declare variables as register to optimize performance, declare them
in decreasing order of importance to ensure that the compiler allocates a
register to the most important variables.
Microsoft C Specific
The compiler ignores register declarations if you select the global register
allocation optimization. You can select global register allocation as
follows:
Environment Selection
────────────────────────────────────────────────────────────────────────────
CL command line Specify either the /Oe or /Ox option.
PWB Select the Global Register Allocation
option in the Debug Build Options or
Release Build Options dialog boxes.
pragma Use the optimize pragma with the e
parameter.
13.2.4 Functions with a Variable Number of Arguments
Functions that accept a variable number of arguments are not portable.
Although both the ANSI Standard and The C Programming Language specify how
to write these functions and how they behave, differences still exist among
compiler implementors about how to use variable argument lists.
Many UNIX(R) systems support a standard that differs from the ANSI Standard
for variable arguments. Although this may change, it currently presents a
portability concern.
Microsoft C run-time libraries and macros allow you to use whichever version
of variable argument support you expect to be most portable for your
application.
13.2.5 Evaluation Order
The C language does not guarantee the evaluation order of most expressions.
Avoid writing constructs that depend on evaluation within an expression to
proceed in a particular manner. For example,
i = 0;
func( i++, i++ );
.
.
.
func( int a, int b )
{
A compiler could evaluate this code fragment and pass 0 as a and 1 as b.
It could also pass 1 as a and 0 as b and conform equally with the
standards.
The C language does guarantee that an expression will be completely
evaluated at any given "sequence point." A sequence point is a point in the
syntax of the language at which all side effects of an expression or series
of expressions have been completed.
These are the sequence points in the C language:
1. The semicolon (;) statement separator
2. The call to a function after the arguments have been evaluated
3. The end of the first operand of one of the following:
■ Logical AND (&&)
■ Logical OR (||)
■ Conditional (?)
■ Comma separator (,) when used to separate statements or in
expressions; the comma separator is not a sequence point when it
is used between variables in declaration statements or between
parameters in a function invocation
4. The end of a full expression, such as
■ An initializer
■ The expression in an expression statement (for example, any
expression inside parentheses)
■ The controlling expression of a while or do statement
■ Any of the three expressions of a for statement
■ The expression in a return statement
13.2.6 Function and Macro Arguments with Side Effects
Run-time support functions can be implemented either as functions or as
macros. Avoid including expressions with side effects inside function
invocations unless you are sure the function will not be implemented as a
macro. Here is an illustration of how an argument with side effects can
cause problems:
#define limit_number(a) ((a>1000)?1000:(a))
a = limit_number( a++ );
If a ≤ 1000, it is incremented once. If a > 1000, it is incremented twice,
which is probably not the intended behavior.
A macro can be used safely with an argument that has side effects if it
evaluates its parameter only once. You can determine whether a macro is safe
only by inspecting the code.
A common example of a run-time support function that is often implemented as
a macro is toupper. You will find your program's behavior confusing if you
use the following code:
char c;
c = toupper( getc() );
If toupper is implemented as a function, getc will be called only once,
and its return value will be translated to uppercase. However, if toupper
is implemented as a macro, getc will be called once or twice, depending on
whether c is upper- or lowercase. Consider the following macro example:
#define toupper(c) ( (islower(c)) ? _toupper(c) : (c) )
If you include the toupper macro in your code, the preprocessor expands it
as follows:
/* What you wrote */
c = toupper( getc() );
/* Macro expansion */
ch = (islower( (getc()) ) ? _toupper( getc() ) : (getc()) );
The expansion of the macro shows that the argument to toupper will always
be called twice: once to determine if the character is lowercase and the
next time to perform case translation (if necessary). In the example, this
double evaluation calls the getc function twice. Because getc is a
function whose side effect is to read a character from the standard input
device, the example requests two characters from standard input.
13.2.7 Environment Differences
Many programs perform some file I/O. When writing these programs for
portability, consider the following:
■ Do not hard-code file or path names. Use constants you define either
in a header file or at the beginning of the program.
■ Do not assume the use of any particular file system. For example, the
UNIX-model, hierarchical file system is prevalent on small computers.
On larger systems, the file system often follows a different model.
■ Do not assume a particular display size (number of rows and columns).
■ Do not assume that display attributes exist. Some environments do not
support such attributes as color, underlined text, blinking text,
highlighted text, inverse text, protected text, or dim text.
13.3 Portability of Data Files
Data files are rarely portable across different CPUs. Structures, unions,
and arrays have varying internal layout and alignment requirements on
different machines. In addition, byte ordering within words and actual word
length may vary.
The best way to achieve data-file portability is to write and read data
files as one-dimensional character arrays. This procedure prevents alignment
and padding problems if the data are written and read as characters. The
only portability problem you are likely to encounter if you follow this
course is a conflict in character sets; many computers have character-set
conversion utilities.
13.4 Portability Concerns Specific to Microsoft C
Microsoft C offers extensions that let you take advantage of the full
capabilities of the computer. These extensions are not portable to other
compilers or environments. The following list shows keywords specific to
Microsoft C:
_asm○ _far _huge pascal
_based _fastcall _interrupt _pascal
cdecl fortran near _saveregs
_cdecl _fortran _near _segment
_export huge _loadds _segname
far
The Microsoft C Reference contains compatibility information for every
function in the run-time library. Any function or macro that does not have
the ANSI box marked may not be portable to other compilers or computer
systems.
13.5 Microsoft C Byte Ordering
Tables 13.4 and 13.5 summarize Microsoft C byte ordering for short and long
types, respectively. In these tables, the least-significant byte of the data
item is b0; the next byte is denoted by b1, and so on.
Since byte ordering is machine specific, any program that uses this byte
ordering will not be portable.
Table 13.4 Byte Ordering for Short Types
╓┌───────────────────────────┌───────────────────────────────────────────────╖
CPU Byte Order
────────────────────────────────────────────────────────────────────────────
8086 b0 b1
CPU Byte Order
────────────────────────────────────────────────────────────────────────────
8086 b0 b1
80286 b0 b1
PDP-11(R) b0 b1
VAX-11(R) b0 b1
M68000 b1 b0
Z8000(R) b1 b0
────────────────────────────────────────────────────────────────────────────
Table 13.5 Byte Ordering for Long Types
╓┌────────────────────────┌──────────────────────────────────────────────────╖
CPU Byte Order
────────────────────────────────────────────────────────────────────────────
8086 b0 b1 b2 b3
80286 b0 b1 b2 b3
PDP-11 b2 b3 b0 b1
VAX-11 b0 b1 b2 b3
M68000 b3 b2 b1 b0
CPU Byte Order
────────────────────────────────────────────────────────────────────────────
M68000 b3 b2 b1 b0
Z8000 b3 b2 b1 b0
────────────────────────────────────────────────────────────────────────────
PART IV OS/2 Support
────────────────────────────────────────────────────────────────────────────
The Microsoft C Professional Development System provides support for OS/2
development.
Chapter 14 explains many of the general issues of OS/2 development,
including accessing the OS/2 system functions, creating module-definition
files, and using the OS/2-specific features of utilities such as the linker
and BIND. Chapter 15 focuses on how to create a multithread application,
including information about C run-time library support, potential problem
areas, and how to use CodeView to debug multithread applications. Chapter 16
concentrates on the creation of dynamic-link libraries, including C run-time
library support, application program interface with DLLs, and debugging DLLs
with CodeView.
Chapter 14 Building OS/2 Applications
────────────────────────────────────────────────────────────────────────────
Using Microsoft C 6.0, you can create applications for OS/2. This chapter
explains features in the compiler and the utilities that
■ Call the OS/2 operating system directly from C functions
■ Perform multitasking within your program by starting multiple
execution paths known as "threads"
■ Create dynamic-link libraries that can be used by multiple
applications
■ Work in either OS/2 or DOS to create programs for both environments
■ Develop "dual-mode" applications that will run under both OS/2 and DOS
from a single executable program file
This chapter contains information about accessing the OS/2 Applications
Program Interface (API) from your C programs. It also discusses compile
options that affect applications you develop for OS/2, module-definition
files and import libraries, linker options specific to developing OS/2
applications, and using the BIND utility to create dual-mode applications.
Chapters 15 and 16, "Creating Multithread OS/2 Applications" and
"Dynamic-Linking with OS/2," contain detailed information about how
Microsoft C supports these advanced OS/2 features.
14.1 The OS/2 Applications Program Interface
The entire set of OS/2 system calls is known as the OS/2 API. You need to
access the OS/2 API for the low-level functions provided by the operating
system, such as
■ Requests for information about the display
■ Requests to display information
■ Requests for information from the pointing device (mouse)
■ Requests for information from the keyboard
■ Requests for blocks of memory
■ Requests for disk actions, including reading and writing
You can call all of the OS/2 system services directly from programs written
in C. Under DOS, the API operates at a lower level, requiring programs to
set up hardware registers and generate a software interrupt to access the
system services. Under OS/2, programs use function calls to access the
operating system services.
Sections 14.1.1-14.1.3 describe the calling conventions and precautions you
must observe when accessing OS/2 API functions.
14.1.1 Calling the OS/2 API
Your program must declare calls to the OS/2 API with both the _far and
_pascal keywords. Adding the _pascal keyword to the function declaration
ensures that the FORTRAN/Pascal calling convention is used. The _far keyword
directs the compiler to generate an intersegment call instruction. A sample
declaration for the OS/2 API function DosExit follows:
void _far _pascal DosExit( unsigned int, unsigned int );
You must be sure that all pointers passed to OS/2 API functions are far
pointers, even if you are writing a program using the small or medium memory
models. This process can be simplified if you include the OS2.H header file.
OS/2 API function calls are far and must use the FORTRAN/ Pascal calling
convention.
OS/2 API functions use the FORTRAN/Pascal language calling convention. They
expect arguments to be pushed onto the stack in left-to-right order, with
the last argument in the list pushed onto the stack last. OS/2 API functions
remove their arguments from the stack before returning to the caller.
Standard C functions push their arguments from right to left, with the first
argument being the last one pushed.
All OS/2 API functions return 0 if the operation is successful. They return
an error code if the operation fails.
14.1.2 Including the OS/2 Header Files
You do not have to construct your own API declarations if you use the OS2.H
header file. It is the first file of a set of header files that supply
function prototypes for every OS/2 API call and definitions of special OS/2
structures, data types, and constants.
The API function prototypes define all functions as far procedures with the
FORTRAN/Pascal calling convention. They also take care of casting all near
pointers to far pointers and other similar type coercions.
Define a constant before including OS2.H.
When you include OS2.H, the most commonly used data types and macros are
automatically defined. To minimize compile time for the C preprocessor,
other definitions are grouped by function. They are included only if your
source file defines the appropriate constant before including OS2.H. The
following list shows how these manifest constants affect functions from the
OS/2 API:
Constant Effect
────────────────────────────────────────────────────────────────────────────
INCL_BASE All error constants, kernel, keyboard,
video, and mouse definitions (same as
INCL_DOS + INCL_SUB + INCL_DOSERRORS)
INCL_DOS All kernel system definitions
INCL_DOSERRORS All error constants
INCL_KBD All keyboard definitions
INCL_MOU All mouse definitions
INCL_SUB All keyboard, video, and mouse
definitions (same as INCL_KBD + INCL_VIO
+ INCL_MOU)
INCL_VIO All video-display definitions
INCL_WIN Basic set of Presentation Manager
definitions
The header files have additional constants that let you include smaller
subsets or functions not defined in the standard sets.
The statement #define INCL_DOS affects the functions defined.
The program in the example below calls the OS/2 kernel to request a
nonshareable, nondiscardable memory segment for an 8K buffer. The INCL_DOS
constant in the #define statement instructs the C preprocessor to include
all of the kernel function definitions. The function prototype for
DosAllocSeg declares the first and third arguments as USHORT (unsigned short
integers). The second argument is a far pointer to the OS/2 data type SEL,
which is used for segment selectors.
#define INCL_DOS
#include <os2.h>
VOID GetMemorySegment()
{
SEL selector;
if ( DosAllocSeg( 8192, &selector, 0 ) )
puts( "Allocation failed\n" );
else
puts( "Successful allocation\n" );
}
The function call in the example works correctly even in a small or medium
memory model program where the selector variable is a near data type. All
three arguments are coerced by the function prototype to the proper types,
regardless of the memory model used.
14.1.3 Creating Dual-Mode Programs as Family Applications
The OS/2 API has a subset of system functions that have direct DOS
equivalents. This subset is known as the "Family Applications Program
Interface" (Family API). Programs that use only the Family API can be run
under DOS and the OS/2 compatibility box, as well as under OS/2.
You can build a single executable file for use under both OS/2 and DOS.
By creating a Family API application, you can distribute the same executable
file to both OS/2 and DOS users. The Microsoft C compiler, linker, and
object module librarian are examples of family applications. The benefit of
having a single executable file is offset by a few disadvantages:
■ The executable file is larger, because it includes a special loader
and OS/2 API-simulator routines for running in DOS mode.
■ In real mode, the application loads more slowly than a program created
specifically for either OS/2 or DOS. There is no performance penalty
in loading or running in OS/2 protected mode.
■ When running in real mode, the program cannot use advanced OS/2
features such as multiple threads or system calls that are not part of
the Family API. If you take special precautions (described in Section
14.5, "The BIND Utility"), the program can take advantage of these
features when running in OS/2 protected mode.
Follow the same steps to build both family and protected-mode applications
but add an extra step at the end to create the Family API program. This step
links functions from the dynamic-link libraries directly into a stand-alone
executable file that can run in both real and protected mode.
Restrictions on Family Applications
Programs that use the Family API are subject to certain restrictions:
■ They cannot overcommit memory; they must fit into the DOS 640K
environment.
■ They cannot use advanced OS/2 features, such as threads and
semaphores, that do not have DOS counterparts.
■ They must restrict their use of some calls to the defined common
subset. For example, some of the file-mode options for the DosOpen
function are not available in real mode.
Family API Functions
The system calls that make up Family API are listed below. The calls marked
with an asterisk (*) have different options or behavior, depending on
whether they are running in real mode or protected mode. The Microsoft OS/2
Programmer's Reference explains the functions and the differences between
their real- and protected-mode implementations.
DosAllocHuge* DosHoldSignal* DosSubSet
DosAllocSeg* DosInsMessage* DosWrite
DosBeep DosMkDir KbdCharIn*
DosBufReset DosMove KbdFlushBuffer*
DosCaseMap* DosNewSize KbdGetStatus*
DosChdir DosOpen* KbdPeek*
DosChgFilePtr DosPutMessage* KbdSetStatus*
DosCLIAccess DosQCurDir KbdStringIn*
DosClose DosQCurDisk VioGetBuf
DosCreateCSAlias* DosQFHandState VioGetConfig
DosDelete DosQFileInfo VioGetCurPos
DosDevConfig DosQFileMode VioGetCurType
DosDevIOCtl* DosQFSInfo VioGetMode
DosDupHandle DosQHandType VioGetPhysBuf
DosErrClass DosQVerify VioReadCellStr
DosError* DosRead* VioReadCharStr
DosExecPgm* DosReallocHuge* VioScrLock*
DosExit* DosReallocSeg* VioScrollDn
DosFileLocks DosRmDir VioScrollLf
DosFindClose DosSelectDisk VioScrollRt
DosFindFirst DosSetCp VioScrollUp
DosFindNext* DosSetDateTime VioScrUnLock
DosFreeSeg* DosSetFHandState* VioSetCurPos
DosGetCollate* DosSetFileInfo VioSetCurType
DosGetCp DosSetFileMode VioSetMode
DosGetCtryInfo* DosSetFSInfo VioShowBuf
DosGetDateTime DosSetSigHandler* VioWrtCellStr
DosGetDBSCEv* DosSetVec* VioWrtCharStr
DosGetEnv DosSetVerify VioWrtCharStrAtt
DosGetHugeShift DosSizeSeg VioWrtNAttr
DosGetMachineMode DosSleep VioWrtNCell
DosGetMessage* DosSubAlloc VioWrtNChar
DosGetVersion DosSubFree VioWrtTTy
14.2 Compile Options for the CL Command
This section describes the compile options you must specify in the
Programmer's WorkBench or on the CL command line to designate a program's
target environment (OS/2, DOS, or both). It also introduces options you
should use with certain types of OS/2 applications, such as multithread
programs, dynamic-link libraries, and programs calling C function
dynamic-link libraries. For an in-depth discussion of topics that affect
multithread processes and dynamic-link libraries, see Chapter 15, "Creating
Multithread OS/2 Applications," and Chapter 16, "Dynamic-Linking with OS/2."
14.2.1 The Link Mode Options (/Lp, /Lr, and /Lc)
The /Lx options (/Lp, /Lr, and /Lc) provide the flexibility of programming
for both OS/2 and DOS in either environment. Regardless of the host
operating system, you can build applications for either target operating
system. You do not have to switch to the target system to build the program.
The /Lp option produces an OS/2 protected-mode program; the /Lr option
creates a DOS real-mode program. /Lc is a synonym for /Lr.
To use these options, the mode-specific combined libraries must be
installed. Unless you choose a default operating environment, each
mode-specific library has the letter P or R at the end of its base name. For
example, the protected-mode small memory model library with the emulator
floating-point option is named SLIBCEP.LIB. The corresponding real-mode
library is named SLIBCER.LIB. The default name, however, is SLIBCE.LIB.
Installing and Using the Microsoft C Professional Development System
describes how to create mode-specific libraries with the SETUP program. It
also explains how to establish a default target environment by renaming
libraries. A default environment is useful if you work mainly in one mode
(OS/2 or DOS) but sometimes write programs for the other mode. When you set
up OS/2 as the default mode, SLIBCEP.LIB, for example, becomes SLIBCE.LIB.
Don't use /Lx options unless you have mode-specific libraries.
When you use the /Lx options, you instruct the compiler to override the
default library name in the object module's library search record and to
substitute the mode-specific combined library name. The compiler also
generates a link response file with the /NODEFAULTLIBRARYSEARCH (/NOD)
linker option to override the default library. See Section 14.4, "Link
Command-Line Options," for more information about the /NOD option.
Do not use the /Lp option to specify protected mode when OS/2 is the default
environment. If you do this, the compiler uses the name of the mode-specific
library (e.g., SLIBCEP.LIB). Because SETUP renamed the library to SLIBCE.LIB
to create a default environment, the library search fails. This caution also
applies to specifying /Lr when you have installed DOS as the default
environment.
If you invoke the linker in a separate step from the compilation, you must
specify the /NOD link option.
────────────────────────────────────────────────────────────────────────────
NOTE
There is a special library, LLIBCMT, for building multithread OS/2
applications. Another special library, LLIBCDLL, supports multithread
dynamic-link libraries. If you use LLIBCMT or LLIBCDLL, you must use one of
the library selection options described in Section 14.2.3 instead of / Lp.
────────────────────────────────────────────────────────────────────────────
14.2.2 Creating Bound Programs Option (/Fb)
The /Fb option allows you to compile, link, and bind an application in one
step. Binding an executable file creates a Family API program that can run
under both OS/2 and DOS.
When you use /Fb, the compiler invokes the BIND utility program immediately
after the link step. You can also execute BIND directly (as described in
Section 14.5, "The BIND Utility"). You must have the API.LIB and OS2.LIB
files in the path specified by the LIB environment variable or in your
current working directory.
The syntax for the /Fb option is
/Fb«bound-exe»
You can specify a separate name for a bound-executable file.
The optional bound-exe parameter specifies the name of the bound program. It
must directly follow the /Fb option, without intervening spaces. The
bound-exe name can be a file specification, a drive name, or a directory
specification. If you specify a file name without an extension, the compiler
appends the .EXE extension to the name. If you give a directory
specification for bound-exe, the name must end with a backslash ( \ ) so the
compiler can distinguish it from an ordinary file name. If you do not supply
a name, BIND uses the name of the unbound program and overwrites it.
When creating both bound and protected-mode versions with different names,
consider this example:
CL /Lp /Fbsampleb sample.c
The protected-mode executable file that this command creates is called
SAMPLE.EXE; the bound-executable file is called SAMPLEB.EXE.
You may need to run BIND as a separate step instead of using the /Fb
option.
The /Fb option works only if you are doing a single-step compile and link.
If the CL command line includes the /c (compile without link) option, the
compiler ignores the /Fb option. If you use /c, you must run the BIND
utility as a separate step of the program build.
If your program includes calls to API functions that are not in the FAPI
subset, you must use the /n option of the BIND utility, described in Section
14.5, to build the dual-mode executable file. If you need to use the /n BIND
option, you cannot compile with /Fb. You must compile without linking by
using the /c option at the compile stage; then link the program and run the
BIND utility with the /n option.
14.2.3 Library Selection Options (/MT, /ML, /MD, /Zl)
Special libraries are provided for building OS/2 multithread applications
and dynamic-link libraries. You must not use these libraries with any other
C run-time library.
Special libraries must be the only C run-time libraries linked with your
program.
If you use one of these special libraries, apply one of the library
selection options (/ML, /MD, or /MT) to tell the compiler to replace the
default library name in the object file with the name of the special
library. This ensures that the linker does not bring in code from the
default libraries. If you do not specify one of the options when compiling,
you must link with the /NOD option to prevent search of a default library,
such as SLIBCE.LIB.
If you fail to include any of these options, the linker searches the default
library and may select the wrong version of a library function. It might,
for example, select the single thread version of the printf function for a
multithread program that has more than one thread calling printf.
Because the /Lp option (see Section 14.2.1, "The Link Mode Options")
instructs the compiler to specify the default protected-mode libraries
rather than the special multithread or DLL-specific libraries, do not use it
with /Zl or /Mx.
Multithread Library Option (/MT)
When you specify the /MT option, the compiler embeds the LLIBCMT.LIB
library name in the object file. Chapter 15, "Creating Multithread OS/2
Applications," explains how to build multithread applications using
LLIBCMT.LIB. The /MT option also has the effect of combining these
command-line options:
/ALw /FPi /G2 /D MT
C Run-Time Library for Building DLLs (/ML)
Use the /ML option to specify that you are building a dynamic-link library
that calls functions in LLIBCDLL.LIB, the C run-time library for
dynamic-link libraries. The library name is embedded in the object file.
The /ML option also has the effect of combining these command-line options:
/ALw /FPa /G2 /D MT
C Run-Time Library for DLLs (/MD)
Use the /MD option to create a dynamic-link library of C run-time routines.
With this option, the object file does not have any library search records.
The /MD option has the effect of combining these command-line options:
/ALw /FPi /G2 /DDLL /D MT
Chapter 16, "Dynamic Linking with OS/2," describes the process of building
and using dynamic-link libraries with LLIBCDLL.LIB.
Suppress Default Library Option (/Zl)
Use the /Zl option when you want to suppress selection of a default library.
It tells the compiler not to place the default library name in the object
file.
You can specify libraries and additional LINK options on the CL command
line.
You can specify link options or the names of libraries on the CL command
line with the /LINK option. You can also give the library name, with its
.LIB extension, before the /LINK option. Each command below selects the
multithread C run-time library:
CL /Zl myprog.c llibcmt.lib
CL /Zl myprog.c /link llibcmt
If you compile with the /c (compile without link) option, your link command
must include the library name:
LINK myprog, myprog.exe, myprog.map, llibcmt.lib, myprog.def
14.2.4 Memory-Model Options (/Ax)
You must select the memory model appropriate to your application. For
protected-mode applications, the large model provides the most convenient
interface with the special libraries. It provides the additional benefit of
placing code and data into multiple segments, allowing OS/2 to swap parts of
the program to disk efficiently.
Use the large memory model with LLIBCMT (/AL and /MT).
The multithread run-time C library, LLIBCMT.LIB, is a large-model library.
All library function calls must be far calls. In addition, all pointers
passed to functions in the library must be far pointers. If you do not
compile with the /AL option, you use must use the keyword _far when
declaring pointers. Variables can be declared either near or far as long as
they are either passed by value or cast to a far address.
If you want to call fopen for example, you must use code such as the
following:
FILE _far * fp;
fp = fopen( ... );
────────────────────────────────────────────────────────────────────────────
NOTE
If you are using the compact, large, or huge memory model, data pointers are
far by default, so you do not need to explicitly specify _far.
────────────────────────────────────────────────────────────────────────────
Because each thread has its own stack, you have to compile in an SS != DS
model.
Multithread applications require that each thread have its own stack. As a
result, you cannot safely assume that the stack segment is in the default
data group (DGROUP). That means that the stack segment can be different from
the data segment (SS != DS).
To specify that you have selected an SS != DS model, you must use the /Au or
/Aw option. The /MT option is a shorthand way of specifying this combination
of options to the compiler:
/ALw/FPi/G2/DMT
The /MT option also causes the compiler to place a library search record for
LLIBCMT in the object file.
14.3 Module-Definition Files and Import Libraries
A module-definition file tells the linker about the characteristics of an
application or dynamic-link library. It describes names, segments, memory
requirements, and import and export definitions. Export definitions make
functions in the OS/2 dynamic-link libraries (DLLs) available to other
programs. Each export definition specifies a function name. A program using
these functions must have import definitions in order to find each
dynamic-link function. Each import definition specifies a function name and
the name of the dynamic-link library where the function resides.
The IMPLIB utility generates a library of import definitions that can be
examined during the link. For imported functions, the import library can be
used in place of a module-definition file.
Module-definition files are optional for most OS/2 programs. Two types of
programs must use them:
■ Dynamic-link libraries
■ Programs with I/O privileges
Each module-definition file contains one or more module statements defining
attributes of the executable program. The statements and their associated
attributes are listed below:
Statement Attribute
────────────────────────────────────────────────────────────────────────────
CODE Gives default attributes for code
segments
DATA Gives default attributes for data
segments
DESCRIPTION Describes the module in one line
EXETYPE Identifies the operating system
EXPORTS Defines exported functions
HEAPSIZE Specifies local heap size, in bytes
IMPORTS Defines imported functions
LIBRARY Names a dynamic-link library
NAME Names an application
OLD Preserves import information from a
previous version of the library
PROTMODE Specifies that the module runs only in
OS/2 protected mode
REALMODE Relaxes some restrictions that the
linker imposes for protected-mode
programs
SEGMENTS Gives attributes for specific segments
STACKSIZE Specifies local stack size, in bytes
STUB Adds a DOS 3.x executable file to the
beginning of the module, usually to
terminate the program when run in real
mode
In addition to the keywords listed above, each statement includes one or
more fields to complete the attribute description. All keywords must be
entered in uppercase. You can include comments in the module-definition file
by beginning the line with a semicolon (;). For a complete list of the
keywords and their meaning, see on-line help for information about
module-definition files.
14.3.1 Adding a Module-Definition File to the LINK Command
The module-definition file name is the last field of the link command:
LINK objects «,«exe» » «, «map» » «, «lib» » «, «def» » «;»
This example uses the default libraries:
LINK sample, sample.exe, sample.map,,sample.def
When you use a module-definition file, you must use the /c option on the CL
command line and link in a separate step. If you are linking without a
module-definition file, you can use a semicolon after your last entry to
suppress LINK's prompt for the module-definition file name and other missing
parameters.
The segmented-executable linker is the only LINK program that recognizes
module-definition files. Since it is backwards compatible, it should be the
only linker in your path. The QuickC linker does not process these files.
The following sections illustrate ways to use module-definition files.
On-line help describes all of the commands and options available.
14.3.2 Creating Dynamic-Link Libraries (DLLs)
You can build your own dynamic-link libraries. A simple module-definition
file for such a library with one public function is shown below:
LIBRARY Mylib INITINSTANCE
DATA MULTIPLE
EXPORTS
MyProc
You can use the same module-definition file you used to create the
dynamic-link library as input to the IMPLIB utility. IMPLIB generates a
library file with a .LIB extension for use by applications calling your
dynamic-link routines. Section 14.3.5 describes the IMPLIB program. Chapter
16, "Dynamic Linking with OS/2," explains how to build a dynamic-link
library.
The LIBRARY statement tells the linker that this is a dynamic-link library
rather than an application. (Applications use the NAME statement instead of
the LIBRARY statement.)
The EXPORTS statement gives the name of the public function.
You can designate exported functions in a C source file.
The C language keyword _export is an alternative to the EXPORTS statement.
When _export appears in a function declaration or definition, the compiler
puts the function and its parameter size in the object module's export
record. Functions with the _export keyword that are not listed in the
module-definition file cannot have input/output privileges or alias names.
Using generic library names is dangerous.
Since OS/2 systems have many dynamic-link libraries installed, try to pick a
name that uniquely identifies your library. If you choose a generic name,
such as CRT.DLL or WINDOWS.DLL, you run the risk of having your library
overwritten by someone else's dynamic-link library with the same name.
14.3.3 Creating Programs with I/O Privileges
OS/2 programs that must access hardware directly can designate a code
segment with input/output privileges. This segment can then perform a
limited set of I/O instructions but cannot make any calls to dynamic-link
libraries.
You cannot use the C run-time library functions inp and outp for input and
output. Their use is limited to real-mode programs. You can, however, use
in-line assembler code in your C source program to access a port.
The sample module-definition file below shows two segments for a program:
NAME IOPROG
EXETYPE OS/2
SEGMENTS
_IOSEG IOPL
_TEXT NOIOPL
EXPORTS
CharIn 4
CharOut 4
The first code segment contains the I/O portion of the program and has the
IOPL keyword. The second segment is designated NOIOPL (the default).
The EXPORT statement for IOPL functions must include parameter size.
The EXPORTS section names two functions in the IOPL segment that can be
called by procedures outside the segment. It also specifies the size of the
function's parameters. Procedures with I/O privileges must specify the
number of words needed for their parameters.
────────────────────────────────────────────────────────────────────────────
NOTE
Unless the user has specified IOPL=YES in the CONFIG.SYS file, the program
will not load.
────────────────────────────────────────────────────────────────────────────
14.3.4 Creating Presentation Manager Applications
The Presentation Manager calls window and dialog procedures inside a
Presentation Manager application. The sample module-definition file below
exports these procedures and gives the linker additional instructions for
building the program. Module-definition files are optional for Presentation
Manager applications. They can be used to control the way different segments
of the program are loaded.
NAME PMSAMPLE WINDOWAPI
EXETYPE OS/2
STACKSIZE 4096
SEGMENTS
_INIT PRELOAD
_HELP LOADONCALL
_TEXT LOADONCALL
In the preceding example, the NAME statement identifies the program as an
application named PMSAMPLE. The WINDOWAPI keyword tells the linker to mark
the executable file as a Presentation Manager application. Only programs
marked as windows applications or windows-compatible applications can share
the Presentation Manager screen group.
The EXETYPE statement tells the linker to build a program that runs only in
protected mode and to produce the optimal executable file for OS/2.
The STACKSIZE statement allocates 4096 bytes of local stack space. This is
the minimum stack size recommended for Presentation Manager programs.
You can reduce run-time memory requirements.
The SEGMENTS statement controls the way code and data segments are handled.
By default, segments are not brought into physical memory until needed. The
PRELOAD keyword in the example tells the system loader to load the _INIT
segment when the program starts. The _TEXT and _HELP segments are loaded
on demand. You can use the compiler's /NT option to generate your own
segment names, such as _INIT and _HELP. Separate segments are useful for
code that is executed infrequently, such as a help subsystem. This reduces
the amount of run-time memory required for your application, since each
segment will be loaded when and if there is a request for it.
14.3.5 Creating Import Libraries with the IMPLIB Utility
Applications that call dynamic-link library functions must use import
definitions that specify the location of each dynamic-link function. The
definitions consist of a function name and the name of the dynamic-link
library file where it resides.
Although the application can use a module-definition file to create the
import definitions, it is easier to use import libraries built by the IMPLIB
utility.
IMPLIB creates an import library in the form of a file with a .LIB
extension, which is read by the linker. At link time, the .LIB file is
specified in the LINK command line, along with other libraries.
IMPLIB accepts two types of sources:
■ The module-definition file used to create the dynamic-link library
■ The dynamic-link library itself
The IMPLIB command has the syntax:
IMPLIB «/c»libfile deffile «deffile ...»
or
IMPLIB «/c»libfile dynlib «dynlib ...»
The /c option directs IMPLIB to be case sensitive. By default, it is case
insensitive.
The libfile field names the new import library file. The deffile or dynlib
fields name the input files, which are dynamic-link library or
module-definition files.
The following example creates the import library file named MYLIB.LIB from
the MYLIB.DLL dynamic-link library:
IMPLIB mylib.lib mylib.dll
For more information about import libraries and IMPLIB, consult on-line
help.
14.4 Link Command-Line Options
This section describes command-line options that control various aspects of
the linker and the circumstances in which you will need to use them.
/NODEFAULTLIBRARYSEARCH (/NOD)
If you did not compile with /MT, /MD, or /ML, suppress default library
searching.
The /NODEFAULTLIBRARYSEARCH option prevents the linker from searching any
library specified in an object file. When you specify this option, you
should also specify the name of the library to be linked. The minimum
abbreviation for this option is /NOD.
If you are using the multithread library, LLIBCMT, or the dynamic-link
library, LLIBCDLL, you should use this option. Use it with dynamic-link
libraries built with LLIBCDLL. This is mandatory if you did not compile with
the /Zl, /MT, or /ML options.
You can select a specific library by appending the library name to the /NOD
option, as in
/NOD:LLIBCMT.LIB
/NOEXTENDEDDICTSEARCH (/NOE)
The /NOEXTENDEDDICTSEARCH option prevents the linker from searching the
extended dictionary, which is an internal list of symbol locations
maintained by the linker. You need to use this option if a library symbol
(such as _setargv, _binmode, or _varstck) is redefined and you receive error
L2044 from the linker. The minimum abbreviation for this option is /NOE.
/NOIGNORECASE (/NOI)
The /NOIGNORECASE option preserves case sensitivity. By default, LINK maps
all names to uppercase characters. Because many C function names are a mix
of upper- and lowercase letters, it is important to use this option. The
compile option /Zc causes any name declared with the _pascal keyword to be
treated without regard to case at the source level. The minimum abbreviation
is /NOI.
/PMTYPE
The /PMTYPE option is an alternative to specifying Presentation Manager
compatibility with the NAME statement of a module-definition file. Use the
following syntax:
/PMTYPE:type
Type must be one of the following:
Type Effect
────────────────────────────────────────────────────────────────────────────
PM The application is an OS/2 Presentation
Manager application using the
Presentation Manager API and running in
the Presentation Manager screen group.
This type corresponds to specifying
WINDOWAPI in the NAME statement of a
module-definition file.
VIO The application is compatible with the
OS/2 Presentation Manager and can run in
a window or in a
separate screen group. This type
corresponds to specifying WINDOWCOMPAT
in the NAME statement of a
module-definition file.
NOVIO The application is not compatible with
the OS/2
Presentation Manager. It must run in a
separate
screen group. This type corresponds to
specifying
NOTWINDOWCOMPAT in the NAME statement of
a module-definition file.
14.5 The BIND Utility
The BIND utility converts a protected-mode program into a program that runs
in both OS/2 and DOS environments. It replaces Family API calls to
dynamic-link library functions with DOS emulator routines from the API.LIB
library. (See Section 14.1.3, "Creating Dual-Mode Programs as Family
Applications," for a list of Family API calls.) BIND produces a stand-alone
program file that can run under
■ OS/2 protected mode
■ OS/2 real mode
■ DOS 2.x and DOS 3.x
BIND is an alternative to the C compiler's /Fb option described in Section
14.2.2, "Creating Bound Programs Option." You must use BIND instead of the
/Fb option when you compile with the /c (compile without link) option or
when your program includes functions that operate only in protected mode.
You can include functions in a bound application that are not members of
the Family API.
To include functions available only in protected mode, you must run the BIND
utility with the /n option. Your run-time code must call the Family API
function DosGetMachineMode to determine whether it is running in real or
protected mode. When your program executes in real mode, it will be aborted
if it tries to call a function available only in protected mode.
You might choose to design your application so it executes different
sections of code, depending on the machine mode. For example, the
application may need to keep track of the passage of elapsed time or to
detect time-outs. In real mode, you might use polling or timing loops or
perhaps intercept the timer interrupts. In protected mode, you should use
the OS/2 semaphore and timer services, such as DosSetSem and DosTimerAsync,
instead.
Invoke BIND with the following syntax:
BIND infile «implibs» «linklibs» «/o outfile» «/n @file» «/n names»
«/m mapfile»
The /n option provides a way to include protected-mode functions. It has two
formats:
■ A list of one or more names, separated by spaces.
■ The name of a file, preceded by the at (@) sign. The file should
consist of a list of functions, one name per line.
The /o option specifies a name for the bound-executable file. If it is not
present, the name of the input file is used.
The /m option causes a link map to be generated for the real-mode version of
the executable file.
To bind a program named TIMER that uses DosTimerAsync to manage time-outs
when running in protected mode, invoke BIND as follows:
BIND TIMER /n DosTimerAsync
For more information about BIND and other command-line options, consult
on-line help.
Chapter 15 Creating Multithread OS/2 Applications
────────────────────────────────────────────────────────────────────────────
Microsoft C, version 6.0, provides support for creating multithread
applications under OS/2. You should consider using more than one thread if
your application needs to manage multiple activities, such as simultaneous
keyboard and mouse input. One thread can process keyboard input while a
second thread filters mouse activities. A third thread could update the
display screen based on data from the mouse and keyboard threads. At the
same time, other threads can access disk files or get data from a
communications port.
This chapter explains the features in C 6.0 that support the creation of
multithread programs. It also describes some important ways in which
programming for OS/2 is different than programming for DOS.
15.1 Multithread Programs
OS/2 performs the scheduling and allocation of real hardware resources to
multiple programs, or "processes." It does not actually schedule the
processes themselves; it schedules threads belonging to the processes.
A thread is basically a path of execution through a program. It is also the
smallest unit of execution that OS/2 schedules. A thread consists of a
stack, the state of the CPU registers, and an entry in the execution list of
the system scheduler. Each thread shares all of the process's resources.
A process consists of one or more threads and the code, data, and other
resources of a program in memory. Typical program resources are open files,
semaphores, and dynamically allocated memory. A program executes when the
system scheduler gives one of its threads execution control. The scheduler
determines which threads should run and when they should run. Threads of
lower priority may have to wait while higher priority threads complete their
tasks.
Threads operate independently and are unaware of other threads.
All threads in a process operate independently of one another. Unless you
take special steps to make them visible to each other, each thread executes
while completely unaware of the existence of other threads in a process.
Threads sharing common resources, however, must coordinate their work by
using flags, semaphores or some other method of interprocess communication.
See Section 15.3, "Writing a Multithread Program," for more information
about synchronizing threads.
15.1.1 Library Support
All shared functions in a multithread program must be re-entrant.
If one thread is suspended by the OS/2 scheduler while executing the printf
function, one of the program's other threads might start executing. If the
second thread also calls printf, data might be corrupted. To avoid this,
access to static data used by the function must be restricted to one thread
at a time. This process of restricting access to certain data is called
serialization.
You do not need to serialize access to stack-based (automatic) variables
because each thread has a different stack. Therefore, a function that uses
only automatic (stack) variables is re-entrant. The standard C run-time
libraries, such as SLIBCE, have a limited number of re-entrant functions. A
multithread program needing to use C run-time library functions that are
normally not re-entrant should be built with the multithread library
LLIBCMT.LIB.
The Multithread C Library LLIBCMT.LIB
The support library LLIBCMT.LIB is a re-entrant large-model library for
cre-ating multithread programs.
A multithread program linked with LLIBCMT.LIB can use any memory model.
All calls to library functions must use the large-model calling interface
(far code pointers, far calls, and far data pointers). When your application
calls functions in this library,
■ All library calls must be far calls.
■ All library calls must use the C calling convention; programs compiled
using the /Gr (fastcall calling convention) or /Gc (Pascal calling
convention) options must use the standard include files for the
run-time library functions they call.
■ All data and code pointers must be far pointers.
■ Variables passed to library functions must either be passed by value
or cast to a far address.
■ Your main function must be declared far if you are compiling with the
small or compact memory models.
You do not need to explicitly declare far pointers if you are using the
compact, large, or huge memory models, since these models use far pointers
as default. For the large and huge memory models, the function calls are
also far by default.
A small-model program calling a library function such as isupper, for
example, must use declarations like the following:
int _far _cdecl isupper( int _c );
Programs built with LLIBCMT.LIB are entirely self-contained.
Programs built with LLIBCMT.LIB do not share C run-time library code or data
with any dynamic-link libraries they call. Chapter 16 explains how to build
DLLs and how to share code and data between processes.
Alternatives to LLIBCMT.LIB
If you choose to build a multithread program without using LLIBCMT.LIB, you
must do the following:
■ Use the standard C libraries and limit library calls to the set of
re-entrant functions.
■ Use the OS/2 API thread management functions, such as DosCreateThread.
■ Provide your own synchronization for functions that are not re-entrant
by using OS/2 services such as semaphores and the DosEnterCritSec and
DosExitCritSec functions.
The C run-time library functions listed below are re-entrant and can be used
in multithread programs linked with the standard libraries.
abs
atoi
atol
bsearch
chdir
getpid
halloc
hfree
itoa
labs
lfind
lsearch
memccpy
memchr
memcmp
memcpy
memicmp
memmove
memset
mkdir
movedata
putch
rmdir
segread
strcat
strchr
strcmp
strcmpi
strcpy
stricmp
strlen
strlwr
strncat
strncmp
strncpy
strnicmp
strnset
strrchr
strrev
strset
strstr
strupr
swab
tolower
toupper
──────────────────────────────────────────
WARNING
The multithread library LLIBCMT.LIB includes the _beginthread and _endthread
functions. The _beginthread function performs initialization without which
many C run-time functions will fail. You must use _beginthread instead of
DosCreateThread in C programs built with LLIBCMT.LIB if you intend to call C
run-time functions.
────────────────────────────────────────────────────────────────────────────
The Multithread Library Compile Option (/MT)
The /MT option for the CL command is the best way to build a multithread
program with LLIBCMT.LIB. The /MT option embeds the LLIBCMT library name in
the object file. Using the /MT option automatically specifies the /ALw /FPi
/G2 /D MT options. The following list describes what these options do.
Switch Effect
────────────────────────────────────────────────────────────────────────────
/ALw Use the large memory model with separate
stack segment; do not reload the DS
register as part of the entry sequence
for every function
/FPi Generate in-line floating-point
instructions and select the emulator
math package
/G2 Use the 80286 processor instruction set
/D MT Use the multithread version of the
include files
These options can be combined with other options to specify different memory
models and different relationships between the data segment and the stack.
You can override the /G2 and /FPi options by specifying a different option
later on the command line. The following example shows how to override the
floating-point package option:
CL /MT /FPa /Lp PROG.C
────────────────────────────────────────────────────────────────────────────
NOTE
You cannot replace the /MT option with /ALw /FPi /G2. You must use /MT to
generate multithread programs.
────────────────────────────────────────────────────────────────────────────
15.1.2 Include Files
The Microsoft C 6.0 include files contain conditional sections for
multithread applications using LLIBCMT.LIB. To compile your application with
the appropriate definitions, you can
■ Compile with the /MT option described in Section 15.1.1, "Library
Support."
■ Define the symbolic constant MT in your source file or on the command
line with the /D option.
Always use the standard include files.
Standard include files declare C run-time library functions as they are
implemented in the libraries. If you used the Maximum Optimization (/Ox) or
Register Calling Convention (/Gr) option, the compiler assumes that all
functions should be called using the register calling convention. The
run-time library functions were compiled using either the C or the
FORTRAN/Pascal calling convention, and the declarations in the standard
include files tell the compiler to generate correct external references to
these functions.
See Section 15.4, "Compiling and Linking," for examples of how to use the MT
constant.
15.1.3 C Run-Time Library Functions for Thread Control
All OS/2 programs have at least one thread. Any thread can create additional
threads. A thread can complete its work very quickly and then terminate, or
it can stay active for the life of the program.
The LLIBCMT and LLIBCDLL C run-time libraries provide two functions for
thread creation and termination: the _beginthread and _endthread functions.
They also declare the global variable _threadid, which contains the address
of an application's current thread identifier.
The _beginthread function creates a new thread and returns a thread
identifier if the operation is successful. The thread will terminate
automatically if it completes execution, or it can terminate itself with a
call to _endthread.
The global variable _threadid holds the address of the identifier of the
current thread. It is defined in the STDDEF.H file as shown below:
/* define pointer to thread id value */
extern int far * _threadid;
────────────────────────────────────────────────────────────────────────────
WARNING
If you are going to call C run-time routines from a program built with
LLIBCMT.LIB, you must start your threads with the _beginthread function. Do
not use the OS/2 functions DosExit and DosCreateThread. Using
DosSuspendThread can lead to a deadlock condition when more than one thread
is blocked waiting for the suspended thread to complete its access to a C
run-time data structure.
────────────────────────────────────────────────────────────────────────────
The _beginthread and _endthread functions are described in detail below.
Section 15.2 illustrates their use in a sample multithread program.
The _beginthread Function
All threads in a process can execute concurrently.
The _beginthread function creates a new thread. A thread shares the code and
data segments of a process with other threads in the process but has its own
unique register values, stack space, and current instruction address. The
system gives CPU time to each thread, so that all threads in a process can
execute concurrently. You can find a complete description of _beginthread
and its arguments in on-line help.
The _beginthread function is similar to the DosCreateThread function in the
OS/2 API with these differences:
■ The _beginthread function lets you pass arguments to the thread.
■ The stack address points to the bottom of the stack. It is the address
of the start of an array or of the start of a block of dynamically
allocated memory. When you use the DosCreateThread call, the stack
address points to the top of the stack.
■ If you specify NULL for the stack address, _beginthread manages
allocation and deallocation of the thread stack for you. This option
is advantageous because it is difficult for your program to determine
when a thread has terminated, so you cannot know when to deallocate
the thread stack. However, _beginthread maintains enough information
to know when a thread has terminated and deallocates the thread's
stack the next time its thread ID is used.
The _beginthread function returns the thread ID number of the new thread if
successful or -1 if there was an error. Errors include specifying an
odd-address stack or an odd- or zero-length stack (which is different than
passing NULL for the stack address) or trying to create too many threads.
The multithread library, LLIBCMT.LIB, supports the maximum number of threads
allowed by OS/2.
The _endthread Function
The _endthread function terminates a thread created by _beginthread. Threads
terminate automatically when they complete. The _endthread function is
useful for conditional termination from within a thread. A thread dedicated
to communications processing, for example, can quit if it is unable to get
control of the communications port. You can find a complete description of
_endthread in on-line help.
15.2 Sample Multithread C Program
BOUNCE.C is a sample multithread program that creates a new thread each time
the letter `a' or `A' is entered at the keyboard. Each thread bounces a
"happy face" of a different color around the screen. Up to 32 threads can be
created. The program's normal termination occurs when `q' or `Q' is entered.
It will also terminate if it receives the CTRL+C or CTRL+BREAK signals. See
Section 15.4, "Compiling and Linking," for details on compiling and linking
BOUNCE.C.
/* Bounce - Creates a new thread each time the letter 'a'is typed.
* Each thread bounces a happy face of a different color around the
screen.
* All threads are terminated when the letter 'q' is entered or when
* the CTRL+C/CTRL+BREAK signals are received.
*
* This program requires the multithread library. For example, compile
* with the following command line:
* CL /MT BOUNCE.C
*/
#define INCL_NOCOMMON /* Use only what we need */
#define INCL_NOPM /* Don't need PM */
#define INCL_DOSPROCESS /* DosBeep and DosSleep */
#define INCL_DOSSEMAPHORES /* OS/2 semaphore functions */
#define INCL_DOSSIGNALS /* OS/2 signal functions */
#define INCL_VIO
#define INCL_KBD
#include <os2.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <process.h>
#define STACK_SIZE 4096
#define MAX_THREADS 32
void main( void ); /* Thread 1: main */
void KbdThread( void ); /* Thread 2: keyboard input */
void BounceProc( char * MyID ); /* Threads 3 to n: display */
void VioClrScr( void ); /* Screen clear */
void ShutDown( void ); /* Program shutdown */
void VioWrtCStr( char *pchString, /* Write string to display */
unsigned usRow, unsigned usColumn );
void pascal far SigHandler( unsigned SigArg,/* Signal handler */
unsigned SigNum );
/* Screen clear macro */
#define VioClrScr() VioScrollDn( 0, 0, 50, 80, 50, BlankCell, 0 )
struct tagCoords /* Display coordinates */
{
int xLoc;
int yLoc;
int xInc;
int yInc;
};
unsigned long RunFlag = 0; /* "Keep Running" semaphore */
unsigned long ScreenLock = 0; /* Screen update semaphore */
char BlankCell[2] = { 0x20, 0x07 };
VIOMODEINFO vmi = { sizeof( VIOMODEINFO ) };/* Mode information */
PFNSIGHANDLER PrevHandler; /* for SetSigHandler call */
unsigned int PrevAction; /* for SetSigHandler call */
void main() /* Thread One */
{
/* Get display screen's text row and column sizes & clear the
screen.*/
VioGetMode( &vmi, 0 );
VioClrScr();
VioWrtCStr( "Threads running: 00. Press 'a' to start another thread",
vmi.row - 1, 0 );
/* Set the "we are running" semaphore. */
DosSemSet( &RunFlag );
/* Start keyboard thread. Let _beginthread allocate memory
* for the thread's stack.
*/
_beginthread( KbdThread, NULL, STACK_SIZE, NULL );
/* Install signal handler for CTRL+BREAK & CRTL+C. */
DosSetSigHandler( (PFNSIGHANDLER)SigHandler, &PrevHandler,
&PrevAction,
SIGA_ACCEPT, SIG_CTRLC );
/* Wait for "running" semaphore to clear (from signal or 'q' key). */
DosSemWait( &RunFlag, SEM_INDEFINITE_WAIT );
_endthread(); /* Kill all threads */
}
void pascal far SigHandler( unsigned int SigArg, unsigned int SigNum )
{
static char BreakMsg[] = "Signal Termination";
ShutDown();
VioWrtCStr( BreakMsg, vmi.row - 1, 0 );
/* Restore original signal handler for CTRL+BREAK & CRTL+C. */
DosSetSigHandler( (PFNSIGHANDLER)PrevHandler, &PrevHandler,
&PrevAction,
PrevAction, SIG_CTRLC );
}
void ShutDown( void ) /* Clean up display when done
*/
{
/* Lock out screen updates from BounceProc & clear "running" semaphore
*/
DosSemWait( &ScreenLock, SEM_INDEFINITE_WAIT );
DosSemSet( &ScreenLock );
VioClrScr();
DosSemClear( &RunFlag );
}
void KbdThread( void ) /* Thread Two: keyboard */
{
KBDKEYINFO KeyInfo; /* for KbdCharIn call */
char ThreadNr = 0;
char NThreadMsg[4];
do
{
/* Block this thread by waiting for keyboard input. */
KbdCharIn( &KeyInfo, IO_WAIT, 0 );
if( tolower( KeyInfo.chChar ) == 'a' && ThreadNr < MAX_THREADS)
{
ThreadNr++;
_beginthread( BounceProc, NULL, STACK_SIZE, &ThreadNr );
VioWrtCharStr( NThreadMsg, sprintf( NThreadMsg, "%02d",
ThreadNr ),
vmi.row - 1, 17, 0 );
}
} while( tolower( KeyInfo.chChar ) != 'q' );
ShutDown();
}
/* getrandom returns a random number between min and max, which must be in
* integer range.
*/
#define getrandom( min, max ) ((rand() % (int)(((max) + 1) - (min))) +
(min))
void BounceProc( char * MyID ) /* Threads Three to n */
{
int xOld, yOld;
char MyCell[2];
char CurrentCell[2];
int CellLen = 2;
struct tagCoords Coords;
/* Generate update increments and initial display coordinates. */
srand( (unsigned) *MyID * 3 );
Coords.xLoc = getrandom( 0, vmi.col - 1 );
Coords.yLoc = getrandom( 0, vmi.row - 1 );
Coords.xInc = getrandom( -3, 3 );
Coords.yInc = getrandom( -3, 3 );
/* Set up "happy face" & generate color attribute from thread
number.*/
if( *MyID > 16)
MyCell[0] = 0x01; /* outline face */
else
MyCell[0] = 0x02; /* solid face */
MyCell[1] = *MyID & 0x0F; /* force black background */
for( ;; )
{
/* Wait for display to be available, then lock it. */
DosSemWait( &ScreenLock, SEM_INDEFINITE_WAIT );
DosSemSet( &ScreenLock );
/* If we still occupy the old screen position, blank it out. */
VioReadCellStr( CurrentCell, &CellLen, yOld, xOld, 0 );
if ( CurrentCell[0] == MyCell[0] && CurrentCell[1] == MyCell[1] )
VioWrtCellStr( BlankCell, CellLen, yOld, xOld, 0 );
/* Draw new face, then clear screen lock */
VioWrtCellStr( MyCell, CellLen, Coords.yLoc, Coords.xLoc, 0 );
DosSemClear( &ScreenLock );
/* Increment the coordinates for next placement of the block. */
xOld = Coords.xLoc;
yOld = Coords.yLoc;
Coords.xLoc += Coords.xInc;
Coords.yLoc += Coords.yInc;
/* If we are about to go off the screen, reverse direction */
if( Coords.xLoc < 0 || Coords.xLoc >= vmi.col )
{
Coords.xInc = -Coords.xInc;
DosBeep( 400, 50 );
}
if( Coords.yLoc < 0 || Coords.yLoc >= vmi.row )
{
Coords.yInc = -Coords.yInc;
DosBeep( 600, 50 );
}
/* Sleep to slow down screen update rate */
DosSleep( 75L );
}
}
void VioWrtCStr( char *pchString, unsigned usRow, unsigned usColumn )
{
VioWrtCharStr( pchString, strlen( pchString ), usRow, usColumn, 0 );
}
15.3 Writing a Multithread Program
When you write a program with multiple threads, you must coordinate their
behavior and use of the program's resources. You must also make sure that
each thread receives its own stack.
Sharing Common Resources
Each thread has its own stack and its own copy of the CPU registers. Other
resources, such as files, static data, and heap memory, are shared by all
threads in the process. Threads using these common resources must coordinate
their work. OS/2 provides semaphores and the DosEnterCritSec and
DosExitCritSec system services for synchronizing resources.
Your program must provide for resource conflicts.
When multiple threads are accessing static data, your program must provide
for possible resource conflicts. Consider a program where one thread updates
a static data structure containing x,y coordinates for items to be displayed
by another thread. If the update thread alters the x coordinate and is
preempted before it can change the y coordinate, the display thread may be
scheduled before the y coordinate is updated. The item would be displayed at
the wrong location. You can avoid this type of problem by using semaphores
to control access to the structure.
Using semaphores is a way of communicating among threads or processes that
are executing asynchronously of one another. This communication is usually
used to coordinate the activities of multiple threads or processes,
typically by controlling access to a shared resource by "locking" and
"unlocking" the resource. To solve the x,y coordinate update problem
described above, the update thread would set a semaphore indicating that the
data structure is in use before performing the update. It would then clear
the semaphore when both coordinates had been processed. The display thread
must wait for the semaphore to be clear before updating the display. This
process of waiting for a semaphore is often called "blocking" on a semaphore
because the process is blocked and cannot continue until the semaphore
clears.
RAM semaphores are faster than system semaphores.
OS/2 supports two types of semaphores: system and RAM semaphores. You must
use a system semaphore if more than one process needs to access the
semaphore. You can use the much faster RAM semaphores if their use is
confined to the threads within a process.
The BOUNCE.C program in Section 15.2 uses a RAM semaphore named ScreenLock
to coordinate screen updates. Each time one of the display threads is ready
to write to the screen, it calls DosSemWait with a pointer to ScreenLock
and constant SEM_INDEFINITE_WAIT to indicate that the DosSemWait call should
block on the semaphore and not time out. If the ScreenLock semaphore is
clear, the wait function returns immediately. Otherwise, the thread blocks
until the semaphore clears. When the thread receives control again, it calls
DosSemSet to set the ScreenLock semaphore so other threads cannot interfere
with the display. When the thread completes the display update, it releases
the semaphore by calling DosSemClear.
The ShutDown routine in BOUNCE.C is called from both the keyboard thread
and the signal handler. The routine uses the ScreenLock semaphore to make
sure other threads do not write to the screen after the screen has been
cleared.
Screen displays and static data are only two of the resources requiring
careful management. For example, your program may have multiple threads
accessing the same file. Since another thread may have moved the file
pointer, each thread must reset the file pointer before reading or writing.
In addition, each thread must make sure that it is not preempted between the
time it positions the pointer and the time it accesses the file. These
threads should use a semaphore to coordinate access to the file by
bracketing each file access with DosSemRequest and DosSemClear calls. The
following code fragment illustrates this technique:
HSEM hsemIOSem;
DosSemRequest( hsemIOSem, SEM_INDEFINITE_WAIT );
fseek( fp, desired_position, 0L );
fwrite( data, sizeof( data ), 1, fp );
DosSemClear( hsemIOSem );
Thread Stacks
Stack checking is performed for each thread.
All of an application's default stack space is allocated to the first thread
of execution, which is known as thread 1. As a result, you must allocate
memory to provide a separate stack for each additional thread your program
needs. You must do this before creating the thread. Stack checking, if
enabled, is performed for each thread. The keyboard thread in BOUNCE.C calls
the malloc function each time the user wants to start a new display thread.
If the allocation is successful, the _beginthread function is called. The
first argument in the _beginthread call is a pointer to the BounceProc
function, which will execute the threads. The last argument is an ID number
that is passed to BounceProc. BounceProc uses the ID number to seed the
random number generator and to select the thread's color attribute and
display character.
Threads that make calls to the C run-time library or to the OS/2 API must
allow sufficient stack space for the library and API functions they call.
The C printf function requires more than 500 bytes of stack space, and you
should have 2K of stack space available when calling OS/2 API routines. To
be safe, allocate at least 4K for each thread's stack.
Use as little static data as possible.
Since each thread has its own stack, you can avoid potential collisions over
data items by using as little static data as possible. Design your program
to use automatic stack variables for all data that can be private to a
thread. The only global variables in the BOUNCE.C program are either RAM
semaphores or variables that never change once they are initialized.
Signal Handling
Signals are events that interrupt the normal flow of your program's
execution. They are similar to hardware interrupts, but they come from the
operating system or other programs and occur asynchronously. If you do not
provide your own routines, OS/2 will take the default action for each
signal, such as cancelling your program when the user enters CTRL+BREAK. You
can install your own signal handler with the OS/2 API function
DosSetSigHandler.
────────────────────────────────────────────────────────────────────────────
WARNING
The C run-time function signal is not supported in the multithread library
LLIBCMT.LIB.
────────────────────────────────────────────────────────────────────────────
When a signal occurs, OS/2 always suspends thread 1 and gives control to the
signal handler, if installed. As a result, thread 1 must not be executing C
run-time library code when the signal handler gets control or a potential
deadlock condition can occur. In addition, the signal handler must not call
C run-time library functions. Consider the following sequence of events:
1. Thread 2 is executing printf when the user interrupts it by pressing
CTRL+C. The program has designated a CTRL+C signal handler, so OS/2
immediately transfers control to the signal handler in thread 1.
2. The signal handler in thread 1 tries to execute the statement:
printf( "^C: Do you want to quit?" );
3. The printf call in thread 2 has already locked output to the console,
so thread 1's printf must wait for release of that lock.
4. The thread 2 printf function never regains control because the signal
handler must complete before other processing can continue. As a
result, it is never able to release the lock on console output.
If a situation like this happens, the program will wait indefinitely for
resolution of the two mutually exclusive conditions.
A multithread C program can process signals if it adheres to the following
restrictions:
■ Thread 1 must be dedicated to signal handling and must not call the C
run-time library once it identifies the signal handler to OS/2 using
the API function DosSetSigHandler. When the signal handler gets
control, it should set a semaphore or flag so other threads in the
program can determine that the signal has occurred and is being
processed.
■ The other threads in the process must check the status of semaphores
set by thread 1 and respond accordingly.
The BOUNCE.C sample program waits until thread 2, the keyboard handler,
starts before installing the signal handler. It then dedicates thread 1 to
signal handling by having the thread wait for a semaphore. Thread 1 blocks
until either the keyboard thread or the signal handler clears the semaphore.
It then calls _endthread to terminate the process, including all the other
threads.
15.4 Compiling and Linking
The steps for compiling and linking the multithread program BOUNCE.C are
given below:
1. Ensure that the files LLIBCMT.LIB and OS2.LIB are in the directory
specified in your LIB environment variable.
The file LLIBCMT.LIB takes the place of the regular C run-time library
files. The file OS2.LIB provides support for OS/2 system calls made in
the program, such as KbdCharIn.
2. Compile and link the program with the CL command-line option /MT.
The /Lp option instructs the compiler to create a protected-mode
application. The /MT option implies the large memory model with a
separate stack segment (/ALw). The multithread library functions have
their own data segment but use the caller's stack. This option also
sets the library search record to LLIBCMT.LIB and sets the MT symbolic
constant for the multithread versions of the include files. The /link
GRTEXTP option instructs the linker to search GRTEXTP.LIB, the
character-graphics library for protected mode.
To compile and link in a single step, use this CL command line:
CL /Lp /MT BOUNCE.C /link grtextp
For separate compile and link steps, you invoke the compiler and the
linker with this code:
CL /c /Lp /MT BOUNCE.C
LINK BOUNCE;
3. If you choose not to use the /MT option, you must take these steps:
■ Ensure that the special multithread include file support is enabled.
■ Use the /Aw option. This is required because the functions in
LLIBCMT.LIB have their own data segment but use the caller's
stack. The /Aw option specifies a segment setup of SS not equal to
DS with DS not reloaded on function entry.
■ Make sure that only far pointers are passed to library functions.
■ Make sure that all variables are either passed by value or cast to
a far address (the large memory model).
■ Specify the multithread library and suppress default library
selection.
The multithread include files are used when you define the symbolic
constant MT. You can do this with the CL command line option /D MT or
within the C source file before any include statements, as shown
below:
#define MT
#include <stdlib.h>
To compile and link in a single step with the default libraries
suppressed, this is the complete CL command line:
CL /Lp /ALw /Zl /D MT BOUNCE.C /link LLIBCMT+OS2
To perform a two-step compile and link with the default libraries
suppressed in the link step, use these commands:
CL /c /Lp /ALw /D MT BOUNCE.C
LINK /NOD BOUNCE,,,LLIBCMT+OS2;
1. Run the program under OS/2.
15.5 Avoiding Problem Areas
There are several problems you can encounter in creating, linking, or
executing a multithread C program. Some of the more common ones are
described here.
Problem Probable Cause
────────────────────────────────────────────────────────────────────────────
LINK searches for mLIBC f.LIB. If you omit the /NOD option from the
LINK command, LINK searches for the
default library. The
default library should not be used with
multithread programs. The /NOD option
tells the computer not to search the
default libraries. This problem can also
be avoided by compiling with the /Zl
option, which suppresses default library
search records in the object files.
You get error SYS1943. A program Many OS/2 programming errors cause
caused a protection protection violations. A common cause of
violation. protection violations is the indirect
assignment of data to null pointers.
This results in your program trying to
access memory that does not "belong" to
it, so a protection violation is issued.
Protection violations also occur if your
program gets a memory buffer from the
operating system and then tries to read
or write past the end of the
buffer. Another cause of this error is
failing to specify the condition "SS is
not equal to DS" in the CL command
invocation. Specify the correct
conditions with the /ALw memory model
option.
An easy way to detect the cause of a
protection violation is to compile your
program with CodeView information, then
run it in CodeView. When the protection
fault occurs, OS/2 will transfer control
to CodeView, and the cursor will be
positioned on the line that caused the
problem. See Chapter 9, "Debugging C
Programs with CodeView," for more
information about the CodeView debugger.
Your program generates numerous If you attempt to compile and link a
compile and link errors. multithread
program without defining the symbolic
constant MT, many of the definitions
required for the multithread library
will be missing. Define MT on the CL
command line with /MT or /D MT, or use
#define MT in your program.
You can eliminate many potential problems by setting the compiler's warning
level to one of its highest values and heeding the warning messages. By
using the /W3 or /W4 warning level options, you can detect unintentional
data conversions, missing function prototypes, and use of non-ANSI features.
15.6 Using the Protected-Mode CodeView Debugger
The protected-mode version of CodeView (CVP) has special commands for
debugging multiple processes and threads. It adds Thread and Process items
to the standard Run Menu. Your CONFIG.SYS file must specify IOPL=YES for
protected-mode CodeView to run.
To enable multiple process debugging, invoke CodeView with the /O
(offspring) option. Selecting the Process item from the Run Menu brings up a
list box of child processes associated with the parent process. You choose
the process to be debugged by selecting it with the list box. The Process
item will be grey (unselectable) if you did not specify the /O option. The
/O option applies only to debugging multiple processes. You do not need to
use it to debug multiple threads.
Selecting the Thread item from the Run Menu produces a list box showing the
status of each thread associated with the current process. You can use the
list box to designate a different current thread or to change a thread's
status. There are equivalent keyboard commands for each option.
15.6.1 Compiling with the /Zi Option
The compiler option /Zi causes the compiler to include symbolic information
and line numbers in the object file for debugging with CodeView. If you run
LINK in a separate step, you must invoke it with the /CODEVIEW option, which
can be abbreviated as /CO. To compile and link the sample program BOUNCE.C
in a single step, enter this code:
CL /MT /Zi BOUNCE.C
The following commands are for a two-step compile and link:
CL /c /MT /Zi BOUNCE.C
link /CO BOUNCE;
15.6.2 Prompt for Thread Number
When you debug a protected-mode program with CodeView, the command prompt is
preceded by a three-digit number indicating the current thread. Thread 1 is
always the current thread when you start a program. The prompt appears as
001>
15.6.3 Thread Commands
Protected-mode CodeView (CVP) has special commands to control the execution
of threads. The CodeView Thread commands are accessed using the Thread
command from the Run menu. Dialog commands for thread control start with the
tilde character (~). Thread commands specify which thread(s) the command
applies to, followed by the command. The syntax of the dialog version of the
Thread command is
~«specifier«command»»
Entering the tilde character by itself displays the status of all threads.
Enter the tilde and a specifier to see the status of particular threads.
Legal values for the specifier field are listed below:
Specifier Function
────────────────────────────────────────────────────────────────────────────
(blank) Displays the status of all threads
# Specifies the last thread that executed
. Specifies the current thread
* Specifies all threads
n Specifies the number of an existing
thread
The optional command field controls the way specified threads are executed.
If it is omitted, status is displayed, but thread activity is not affected.
Thread commands are summarized below, followed by examples. For more
information about command execution and about how other threads in the
process may be affected, consult on-line help.
Command Function
────────────────────────────────────────────────────────────────────────────
(blank) Display status
BP Set a breakpoint (used with the normal
Breakpoint Set command syntax)
E Execute in slow motion
F Freeze the thread(s)
G Pass control to a thread
P Execute a program step
S Select specified thread as the current
thread
T Trace a thread
U Unfreeze thread(s)
Controlling a Thread Being Debugged
If your program has multiple threads using the same functions, you may want
to monitor the behavior of one particular thread. The standard Breakpoint
Set command will affect every thread. The thread Breakpoint Set command lets
you limit the breakpoint to one or more threads. The sample program BOUNCE.C
has multiple threads executing the function BounceProc. This function erases
the symbol at the thread's current screen position, writes it to a new
location, computes the display coordinates to be used the next time the
thread receives control, and then sleeps to slow down the rate at which the
display is updated.
Since thread-specific breakpoints can only be set for threads that are
already running, you can set a breakpoint that will be executed after the
target thread starts. In BOUNCE.C, the source line in thread 2 that tests
each character received from the keyboard is a good location for such a
breakpoint (line 113). Since thread 2 is not active when the program begins,
you must first set a breakpoint in thread 1 after it has started thread 2
(line 73). The first breakpoint can be set by conventional methods or by
using the thread breakpoint command:
001>~1BP .73
Once you have reached the first breakpoint, you can set the keyboard test
breakpoint for thread 2:
001>~2BP .113
The BOUNCE.C program starts a new thread each time the letter `a' is typed.
(`A' is also accepted.) Once you have started the desired number of threads,
you can trigger the thread 2 breakpoint without starting a new thread by
pressing another key, such as the space bar. When you reach the breakpoint
in thread 2, you can set breakpoints for the other threads. To set a
breakpoint in thread 3's BounceProc function immediately after it has
updated the screen (source line 168), enter this code:
001>~3BP .168
When this breakpoint is reached, the CodeView prompt will reflect the
current thread number:
003>
You can then set other breakpoints for the thread, execute it in slow motion
without any other threads running in the background, or enter other CodeView
commands, such as Breakpoint Clear.
Freezing and Unfreezing Threads
Frozen threads do not execute.
It can be useful to freeze one or more threads so they don't interfere with
execution of a thread you are debugging. In the BOUNCE.C program, for
example, you can monitor the path of a single bouncing ball by freezing all
but one of the bounce threads. Frozen threads will not be scheduled for
execution.
If you have a large number of threads running, you can freeze all of them in
a single command and then unfreeze the threads you want to monitor. Unfrozen
threads continue to operate normally and will execute any breakpoints they
encounter. The following example freezes all threads, enables threads 1 and
4, and then checks the status of all threads:
001>~*F
001>~1U
001>~4U
001>~
If thread 1 is waiting for a semaphore when the status command is invoked,
the report shows the following:
001 Blocked
002 Frozen
003 Frozen
004 Runnable
Switching to a Particular Thread
The S (select) and E (execute) variations of the Thread command can be used
to switch the current thread. However, when another thread causes the
program to stop by hitting a breakpoint, the debugger will select the thread
that encountered the breakpoint as the current thread.
If you include ~.S in the breakpoint command, CodeView stops the thread that
encounters the breakpoint, then immediately switches back to the current
thread. The following example selects thread 4, sets a breakpoint at line
168 in thread 3, and switches to thread 4 when the breakpoint is hit:
001>~4S
001>~3BP .168 "~.S"
001>G
15.6.4 Screen Groups Used by CodeView
Only one CodeView session at a time is supported in protected mode. You
cannot run multiple copies in concurrent screen groups.
The View Output Screen command ( \ ) works differently in protected mode and
in real mode. In protected mode, your application's output will be displayed
for three seconds. The display will then revert to the CodeView display. To
view the output window for a longer period, specify a different delay
interval, measured in seconds, as follows:
\10
Chapter 16 Dynamic Linking with OS/2
────────────────────────────────────────────────────────────────────────────
An OS/2 dynamic-link library (DLL) is an executable file containing
functions that are available to other programs. In a statically linked
program, you link the program with all its component functions when you
build the executable file. In a dynamically linked program, the
program-build step does not link all of the code. Instead, OS/2 links calls
to functions in dynamic-link libraries at program load time or while the
program is running. The DLL code and data become part of the address space
of each program, even when the DLL is being accessed by several application
programs.
This chapter describes how to build your own dynamic-link libraries and how
to build programs that use them.
16.1 Overview of Dynamic Linking
Dynamic linking is the process of resolving external calls when a program
runs, instead of at link time. It offers several benefits:
■ Multiple programs can use the same dynamic-link library
simultaneously. Since only one copy of the DLL is in memory, there are
fewer demands for physical memory and swap space.
■ Updates to dynamic-link libraries do not affect the programs that use
them, since the only connection between DLLs and application programs
is the function-calling sequence.
■ Application programs require less disk space and memory, since their
executable program files contain the names of DLL functions but not
the code for the functions.
■ Dynamic-link libraries can call other dynamic-link libraries.
■ DLLs can extend the OS/2 operating system to provide new or improved
system services. This is possible because most of OS/2 consists of a
set of dynamic-link libraries.
16.1.1 Load-Time and Run-Time Linking
Dynamic linking can take place both at program load time and while the
program is running. A program can call functions in more than one DLL and
combine both load-time and run-time linking.
For load-time dynamic linking, build a program that calls DLL functions by
name.
The linker creates special records containing the name of each DLL
subroutine and the name of its DLL file. It does not put any DLL code into
the program's executable file. At load time, OS/2 dynamically links the
program and its DLLs. It brings the program and the DLLs into memory and
updates the program's DLL calls with the address of each DLL routine. If a
DLL is already in memory, it is not reloaded.
With run-time dynamic linking, the program creates the DLL file name and
subroutine names during execution. The program then passes these names to
OS/2 so the operating system can load the dynamic-link library.
An example of a run-time dynamic link is an extension to the Programmer's
WorkBench (PWB). PWB has no information about which extensions it needs
until it reads the initialization file, TOOLS.INI. PWB then sends requests
to OS/2 to demand-load the DLLs that it needs.
16.1.2 Application Programs and DLLs
With static linking, all library code is bound into the executable program
when you link the program. If the library changes, all programs using the
library must be relinked. With the exception of some Microsoft Windows
programs, all DOS programs use static linking.
Updates to parts of a program are easier to deliver using DLLs.
You can create loosely coupled applications and DLLs and modify the DLLs
without relinking the program. For example, if your product has an
underlying database access mechanism, you can package the database access
routines into a DLL. You can then ship improvements or changes to the
database code in a new dynamic-link library. The executable files for the
program do not have to be relinked or redistributed.
The programs calling a DLL are known as the DLL's "clients."
16.1.3 DLLs and Microsoft C Run-Time Libraries
You can construct three types of dynamic-link libraries with the Microsoft C
Professional Development System. All of them can be multithreaded; they can
support more than one client at a time. There are three types:
■ A stand-alone dynamic-link library that includes both your routines
and code for the Microsoft C run-time library functions used by your
DLL. This type of DLL is self-contained and completely independent of
the programs that call it.
■ A dynamic-link library that does not use any functions from the
Microsoft C run-time library. This type of DLL is also self-contained.
■ A private dynamic-link library that consists only of selected
functions from the Microsoft C run-time library. This DLL is usually
specific to one program or a closely tied group of programs.
Application programs and dynamic-link libraries using this DLL do not
contain any code for the C run-time library functions.
The following sections provide more information about the differences
between the various types of DLLs.
Stand-Alone Dynamic-Link Libraries
Stand-alone DLLs include C run-time functions.
If you want to call C run-time library functions in your DLL, you can
include the functions you need. These run-time functions are statically
linked in the DLL and the DLL does not rely on the client or any other DLL
for run-time support.
Figure 16.1 illustrates the relationships between this type of DLL, an
application program, and C run-time library functions. Both the application
program and the dynamic-link library have their own copies of functions from
the C run-time library. This ensures that
■ The DLL always has access to the C run-time library routines it needs.
■ The DLL is not dependent on the calling application for any support
code.
■ The programs using the DLL do not depend on the DLL for C run-time
library functions.
Section 16.3.1, "DLLs with Static C Run-Time Library Functions," describes
the steps involved in creating this type of dynamic-link library using the
special library LLIBCDLL.LIB.
(This figure may be found in the printed book.)
DLLs without C Run-Time Library Functions
You can write a dynamic-link library in C without calling any functions from
the C run-time library. Section 16.3.2, "DLLs without C Run-Time Library
Functions," shows how to set up this type of DLL. These DLLs contain only
your code and require no run-time library support; they make no calls to
run-time library functions.
Private C Run-Time DLLs
You can create a custom C run-time DLL.
A C run-time DLL can be shared by multiple programs and their DLLs. You
generate the C run-time DLL in two steps. The first builds a
module-definition file with a list of the C run-time library functions
needed by your application and its DLLs; the second step links the
module-definition file with the special library CDLLOBJS.LIB to create a C
run-time DLL.
The executable files for programs and DLLs linked with a customized C
run-time DLL do not contain any code for the C run-time library functions.
Figure 16.2 shows the relationships of the components.
A private C run-time DLL must be closely tied to its programs and
associated DLLs.
Processes and DLLs that share a private run-time DLL share environment
strings and global C run-time data (for example, file pointers for buffered
I/O and memory allocated with the malloc function). Therefore, the program
and the DLLs must cooperate on the use of this data.
(This figure may be found in the printed book.)
A closely tied structure is suitable for a complex application consisting of
a set of application programs that act as front-end processors to several
DLLs. A word processor, for example, might support one user interface for
beginners, another for intermediate users, and a third for expert users. The
different user interfaces could be implemented in three separate executable
program files. All three programs would share the DLLs that do most of the
real work.
Section 16.3.3, "Programs and DLLs with a C Run-Time DLL," describes the
procedures for building a C run-time library DLL and its associated programs
and dynamic-link libraries.
16.2 Designing and Writing DLLs
Before you write a DLL, you must determine some of the DLL's requirements.
You need to know
■ Floating-point math requirements
■ Special initialization requirements such as allocation of buffers or
registration of special termination routines
■ Termination requirements such as clearing semaphores or releasing
allocated memory
■ Re-entrancy requirements; if the DLL is to be called by more than one
process, it must be re-entrant
This section explains how to design a DLL to take these requirements into
account.
16.2.1 Floating-Point Math Requirements
Stand-alone DLLs built with the LLIBCDLL library are independent of the
programs calling them. They are "black boxes" that must operate without
knowing anything about their client programs and without interfering with
their clients.
One area of potential conflict for stand-alone DLLs is control of the 80x87
math coprocessor. For a DLL to use the 80x87 coprocessor or the emulator
floating-point library, the DLL and all of its client programs must agree on
which process is going to handle floating-point exceptions and on which
process is going to handle emulation if the machine does not have a
coprocessor.
Floating-point emulation is not possible with a genuinely independent DLL. A
stand-alone DLL must use the alternate math library, which ignores the math
coprocessor chip. The alternate math library provides the fastest processing
available without a coprocessor, but results are not as accurate as those
produced by the emulator floating-point library. Because the constraint
applies only to the DLL and not to applications, clients of a stand-alone
DLL can use any floating-point model. Since the DLL uses the alternate math
library, it does not conflict with clients over control of the math
coprocessor.
In contrast, DLLs and programs using a private C run-time DLL are tightly
coupled. This means that the floating-point math option is known when the
program is built. Because these programs and DLLs all use the same C runtime
functions (unlike the stand-alone DLL and its clients, which may incorporate
different C run-time libraries), no contention can arise over control of the
math coprocessor. The same floating-point math library is used for the
entire application.
The only way to use a math coprocessor within a DLL is with a private C
run-time DLL.
A private C run-time DLL uses the CDLLOBJS library and the emulator
floating-point package. The emulator uses the 80x87 math coprocessor if one
is installed; otherwise, it emulates the coprocessor. Floating-point
emulation produces the most accurate results. There is no conflict over use
of the coprocessor, since the C run-time DLL performs all floating-point
math. The programs and DLLs calling the C run-time DLL do not have any C
run-time library code of their own.
16.2.2 Initialization and Termination Requirements
When you design a DLL, you must decide if it has special initialization or
termination requirements. If the DLL needs to initialize variables or
allocate memory buffers when it starts, it needs custom start-up procedures.
If the DLL acquires system resources for a client program, the resources
must be released when the program completes its processing.
Initialization
All DLLs built with the Microsoft C run-time libraries must use per-process
initialization to set up the C run-time data. Per-process initialization
(also known as instance initialization) means that OS/2 calls the DLL's
initialization code each time it loads a program linked with the DLL. For
most DLLs, the default initialization routine is sufficient, and you do not
need to take any other measures.
If your DLL has special requirements, you must provide additional start-up
processing.
The C run-time library initialization function is called each time a new
client is attached to the DLL. To override the default initialization, you
must link your DLL with one of the following object modules, which are
provided with the Microsoft C Professional Development System:
File Name Description
────────────────────────────────────────────────────────────────────────────
DLLINIT.OBJ Initialization module for DLLs built
with LLIBCDLL.LIB and using C run-time
library code
CRTDLL_I.OBJ Initialization module for DLLs using a C
run-time DLL built with CDLLOBJS.LIB
(replaces CRTDLL.OBJ)
In addition, you must declare an entry point for your own DLL initialization
function. Your function, or the application program calling your DLL, must
initialize the C run-time data by calling the library function C_INIT before
any other C run-time library functions are called.
The prototype for C_INIT is
void _far _pascal C_INIT( void );
Designate your initialization function as the DLL's starting point.
To have your custom function recognized as the DLL's default initialization
routine, it must be the starting point for the DLL. This requires an
assembly language file with an END statement naming your function. The
sample file, SETENTRY.ASM, in the following example shows the minimum
assembler code required for specifying a C language function named
SampleInit as the DLL's entry point.
; SETENTRY.ASM
extrn _SampleInit:FAR ;name of C start-up routine
end _SampleInit
The following example, SAMPLE.C, shows a simple custom initialization
routine that maintains a count of how many clients it is currently serving.
Since this example overrides the default dynamic-link library
initialization, it must return a nonzero status code to OS/2 to show a
successful start-up. If a DLL initialization function returns a status of 0,
OS/2 will not load the program using the DLL.
/* SAMPLE.C */
void _far _pascal C_INIT( void );
int UserCount = 0;
int _export _loadds SampleInit()
{
UserCount++; /* increment number of users */
C_INIT(); /* initialize C run-time data */
return( 1 ); /* indicate successful start */
}
/* code for other DLL functions belongs here */
All DLLs must be linked with a module-definition file that contains a
LIBRARY statement, such as the following:
LIBRARY SAMPLE INITINSTANCE
The following commands will create object files from the sample files and
link them with DLLINIT.OBJ to make a stand-alone dynamic-link library named
SAMPLE.DLL. The /ML compile option, explained in Section 16.2.6, "Compile
Options for Dynamic-Link Libraries," sets the library search record to
LLIBCDLL.LIB.
MASM /Mx SETENTRY;
CL /c /Gs /ML SAMPLE.C
LINK /NOE DLLINIT+SETENTRY+SAMPLE,SAMPLE.DLL,,,SAMPLE.DEF;
────────────────────────────────────────────────────────────────────────────
WARNING
For DLLs linked with Microsoft C run-time libraries, the LIBRARY statement
in the DLL's module-definition file must specify INITINSTANCE in the
initialization field. If you omit this, the initialization routine is called
only when the DLL is loaded into memory for the first client program, and
the DLL will not function properly if it is called by additional programs.
────────────────────────────────────────────────────────────────────────────
Termination
You may have to clean up before terminating.
You may need to know when an application using your DLL is finished. If your
DLL has created buffers, semaphores, or other resources for a particular
application, they must be released when the application terminates.
You can have an initialization routine in your DLL that calls the OS/2 API
function DosExitList to register one or more exit subroutines for your DLL.
OS/2 will call the exit routines when the client program finishes. The exit
functions should free any resources your DLL acquired for the client
program.
DLLs built with LLIBCDLL.LIB have a default termination routine.
The start-up routine for dynamic-link libraries built with the LLIBCDLL
library calls DosExitList with a pointer to a default termination function.
To replace the default processing with your own function, link the module
DLLTERM.OBJ into the DLL. This suppresses the call to DosExitList. During
initialization, your DLL must register its own routine by calling
DosExitList unless you are sure the termination routine will be called
explicitly. The termination processing must include a call to the library
function C_TERM.
The prototype for C_TERM is
void _far _pascal C_TERM( void );
There is no equivalent to DLLTERM.OBJ and C_TERM for DLLs using a private C
run-time DLL built with the CDLLOBJS library. If special cleanup processing
is required, these DLLs must provide their own termination function. The
function is registered during initialization by calling either the C
run-time library function atexit or the OS/2 API function DosExitList.
Any DLL that calls DosExitList should also have a termination function.
DLLs that set exit lists must provide termination functions that can be
called by clients when they no longer need the DLL. If a program attaches
itself to the DLL at run-time (using DosLoadModule), it cannot disconnect
from the DLL as long as the exit list points to a function in the
dynamic-link library. The DLL's termination function can perform any
necessary cleanup and call DosExitList to remove itself from the exit list.
────────────────────────────────────────────────────────────────────────────
NOTE
There is no special termination procedure for DLLs build with CDLLOBJS.LIB
because the C run-time termination code is called by the exit or _exit
functions. If the process is terminated by a critical error or DosExit, C
run-time termination does not occur.
────────────────────────────────────────────────────────────────────────────
16.2.3 Making the DLL Re-Entrant
Re-entrant code is code that can be shared by multiple programs in a
multitasking environment. DLLs that may be used by more than one program
must be re-entrant. To do this, they must isolate each client program's data
and resources. File handles belonging to one client, for example, must not
be used for other clients. Re-entrancy also means that the DLL cannot allow
itself to be switched to a different thread while it is performing certain
operations.
Global Versus Instance Data
A dynamic-link library can have separate data segments for each program
that calls it.
Separate data segments are known as "instance" data. With instance data
segments, the DLL does not have to keep track of which resources belong to
each client. OS/2 assigns a different data segment to each process calling
the DLL, even though the selectors are the same.
A dynamic-link library can also have a global data segment used for internal
purposes or to support all of the programs using its services.
A DLL providing time and date conversions might, for example, keep the
current date in a global storage area. The same DLL might provide functions
to compute elapsed time, such as the number of minutes between two clock
readings. If static variables are used by the elapsed time functions, they
should be in instance data segments, since the OS/2 scheduler might preempt
the function and schedule another thread that calls the same function with
different arguments before it has completed the first caller's task.
Data sharing is controlled by DATA and SEGMENTS statements in a dynamic-link
library's module-definition file. By default, a DLL's automatic data segment
(the local stack and heap) is shared by all processes calling the DLL. You
can specify a unique automatic data segment for each client process by
specifying DATA MULTIPLE.
────────────────────────────────────────────────────────────────────────────
WARNING
DLLs built with the LLIBCDLL or CDLLOBJS C run-time libraries must use DATA
MULTIPLE in the module-definition file.
────────────────────────────────────────────────────────────────────────────
You can use SEGMENTS to specify attributes on a segment-by-segment basis.
Using the SEGMENTS statement allows you to have both global and per-process
(instance) data in the same DLL. The C run-time data segment must be
per-process. The following is an example of a C program fragment and
moduledefinition file that implement both instance and global data:
/* Define static data in the shared segment SHR_SEG */
int _based(_segname("SHR_SEG")) intvar;
char _based(_segname("SHR_SEG")) charvar;
In the module-definition file, define all data segments as nonshareable,
then override that default for SHR_SEG as follows:
DATA MULTIPLE NONSHARED
SEGMENTS
SHR_SEG CLASS 'FAR_DATA' SHARED
Global data segments are created when OS/2 brings the dynamic-link library
into memory for its first client process. All of the processes calling the
DLL share the same global variables.
Serializing Nonatomic References
An atomic operation is an operation that can be completed in one machine
language instruction. When writing a re-entrant procedure (in a multithread
program or in a DLL), you must ensure that changes to static or global data
are not preempted by the OS/2 scheduler before the update is complete. To
prevent this, you must explicitly serialize nonatomic references to static
or global data. The following code example is safe from preemption, because
incrementing an integer requires only one machine instruction:
int int_var;
_export _loadds void _far _pascal dynlink_proc( void )
{
int_var++;
}
The following variation on the same function is not safe because
incrementing a long variable is not atomic; it requires two machine
instructions. Between incrementing the least-significant word and the
most-significant word, another thread could gain control of the processor.
If that thread executes code in your DLL that uses long_var, that data
would be in an indeterminate state.
long long_var;
_export _loadds void _far _pascal dynlink_proc( void )
{
long_var++;
}
Critical Code Sections
A critical code section is a section of code that manipulates a resource
(such as the long variable in the previous example) while blocking all other
threads. When your program enters a critical section, it cannot be preempted
until it performs a DosExitCritSec or until a signal is received. You don't
usually just alter the value of a variable; you alter it and then use it
later. In this case, you must isolate the smallest group of operations that
must occur without interruption. You define these sections with the
DosEnterCritSec and DosExitCritSec OS/2 API functions, as in the following
example:
_export _loadds void _far _pascal dynlink_proc( void )
{ static int_var;
DosEnterCritSec();
int_var += 7;
SetLeftCorner( int_var, int_var );
DosExitCritSec();
/* Code that does not reference int_var */
}
Keep your critical sections as short as possible.
While in a critical section, all other threads in the process are blocked
from execution. Writing extremely long critical sections can make your
program inefficient and can degrade system performance.
Although other threads are blocked from execution by DosEnterCritSec and
DosExitCritSec, these functions do not block signal handling.
Note that static variables in DLLs are protected from interference from
other processes if they are in an instance data segment designated as
MULTIPLE in the DATA statement of the DLL's module-definition file. Memory
is "owned" by a process and, unless specifically allocated as shareable,
cannot be altered by any other process.
16.2.4 Signal Handling
The C library function signal is not supported for multithread programs or
for DLLs. If you need to process signals, use the OS/2 API signal functions,
such as DosSetSigHandler.
See Chapter 15, "Creating Multithread OS/2 Applications," for more
information about signal handling in OS/2 programs.
16.2.5 Using Microsoft C Keywords
The _export and _loadds keywords simplify writing DLLs. They are used to
define or declare functions or pointers to functions. In the DLL, an
exported function with a single argument might be defined as
int _export _loadds sample( int )
The _export Keyword
All DLL functions that will be called from outside the library must be
exported.
The _export keyword gives a function the export attribute. Stack checking
must be disabled for exported entry points. You can use the /Gs compile
option or the check_stack pragma to accomplish this.
Using the _export keyword is an alternative to declaring the name of the
function in the EXPORTS section of a module-definition file. It assigns
certain default attributes: no I/O privilege, shared data, load on demand,
and no alias name. If the defaults are not acceptable, you must specify the
proper attributes in the module-definition file.
Not all functions in a DLL are for external use. A DLL can have any number
of utility subroutines supporting the work of the exported functions.
Functions that are private to the DLL should not have the _export keyword.
The _loadds Keyword
At entry to a DLL, the DS (data segment) register points to the calling
program's data segment. To access the DLL's data, the DS register has to be
loaded with the DLL's segment selector. The _loadds keyword causes the
compiler to add prolog and epilog code to the function. The prolog code
initializes the DS register to point to the function's data group. The
epilog code restores the caller's DS register when the function terminates.
Since loading the DS register is a high overhead operation, you should limit
the use of _loadds to the exported functions in your DLL.
────────────────────────────────────────────────────────────────────────────
WARNING
Do not use the _loadds keyword in a function definition if the function uses
only stack variables. If you specify _loadds in a DLL that does not have any
static data, the linker will issue a segment fix-up error.
────────────────────────────────────────────────────────────────────────────
16.2.6 Compile Options for Dynamic-Link Libraries
Dynamic-link libraries must be compiled with specific options that control
linking, memory models, and library selection.
Compile without Linking (/c)
You must use the /c option to build your DLL in separate compile and link
steps. This is necessary because the DLL must be linked with a
module-definition file specifying that the output file is a dynamic-link
library. (The compiler does not pass module-definition file names to the
linker.) The /c option is automatically specified in the makefile generated
by the Programmer's WorkBench.
Large Memory Model with Separate Stack (/ALw)
The /ALw option instructs the compiler to use the large memory model with a
separate stack segment. Because all DLLs use the caller's stack, you must
use /Aw or /Au. The /Aw option sets up separate stack and data segments but
does not cause the DS register to be reloaded at the entry to each function.
This allows you to call private functions (functions that you do not export)
without incurring the overhead of loading the DS register. Functions that
you do export must also be declared using the _loadds keyword, described
above, which sets up the proper DS register handling. If you use the /Au
option, the DS register will be reloaded on entry to every function, which
can cause the function calls in your DLLs to execute more slowly.
All DLL functions are reached using far calls. Pointers passed to and from
the DLL must be far pointers.
Remove Stack Probes (/Gs)
Since the DLL uses the caller's stack, you should usually use the /Gs option
to disable stack checking within the DLL.
Specify 80286 Code (/G2)
Use the /G2 option to designate code generation for the 80286 processor
instruction set, since OS/2 runs only on 80286 and higher model processors.
Link C Run-Time into Stand-Alone DLL (/ML)
Use the /ML option to build a stand-alone dynamic-link library that includes
static code for C run-time library functions. This option has the same
effect as using the /ALw, /FPa, /G2, and /D MT options. It changes the
library search record to LLIBCDLL.LIB. See Section 16.3.1, "DLLs with Static
C Run-Time Library Functions" for more information about these options.
Link Executable or DLL with C Run-Time DLL (/MD)
Use the /MD option to build an executable file or a dynamic-link library
that calls a C run-time DLL. This option has the same effect as using the
/ALw, /FPi, /G2, /D DLL, and /D MT options. It inhibits library search
records. See Section 16.3.3, "Programs and DLLs with a C Run-Time DLL," for
more information about these options.
Suppress Default Library Selection (/Zl)
If you do not compile with the /MD or /ML options described above, compile
with the /Zl option or use the /NOD option when you link in order to inhibit
searches for default libraries.
16.3 Building DLLs with Microsoft C
Building a DLL for OS/2 is like building an executable program file.
To build a DLL, compile and link the dynamic-link library like any other
executable file, but add a module-definition file. This module-definition
file tells the linker that the output is a dynamic-link library.
When you build applications that use a dynamic-link library, you must tell
the linker where to find the library's dynamically linked functions. You use
import libraries and module-definition files for this purpose.
16.3.1 DLLs with Static C Run-Time Library Functions
The LLIBCDLL library is used to create stand-alone DLLs. The library
functions are re-entrant and can be called by multiple threads within a
program as well as by multiple programs. The code for the stand-alone DLL's
C run-time library functions is contained within the DLL. Programs that call
stand-alone DLLs have their own run-time library code.
Building the DLL
The files required to build a stand-alone DLL with the LLIBCDLL library are
listed below:
File Name Description
────────────────────────────────────────────────────────────────────────────
OS2.LIB OS/2 kernel import library
LLIBCDLL.LIB Large-model multithread C run-time
library for DLLs
DLLINIT.OBJ Optional initialization module for DLLs
requiring custom initialization
DLLTERM.OBJ Optional termination module for DLLs
requiring custom exit processing
userdll.C Source code for the DLL you create
userdll.DEF Module-definition file for the DLL you
create
The module JUSTIFY.C, below, is an example of source code for a simple
dynamic-link library. The RightJustify routine calls the strlen function
from the C run-time library and right-justifies a caller's buffer. The
function definition includes the _export keyword. The _loadds keyword is
omitted, since this function does not need any static data. If it did, you
would need to specify _loadds.
For simplicity, JUSTIFY.C below shows a DLL with a single function. In
actual practice, you would usually package a group of similar utilities into
one DLL.
/* JUSTIFY.C -- Sample Dynamic-Link Library */
#include <string.h>
/* Right justifies the string in TargetBuff to TargetSize
* and inserts necessary number of FillChars on the left.
*/
#pragma stack_check(off)
int _export RightJustify( char *TargetBuff, int TargetSize,
char FillChar)
{
char *s, *d;
s = TargetBuff + strlen( TargetBuff );
d = TargetBuff + TargetSize;
while ( s = TargetBuff )
*d-- = *s--;
while ( d = TargetBuff )
*d-- = FillChar;
return( 0 );
}
The steps for creating a stand-alone dynamic-link library with JUSTIFY.C are
given below. The DLL in the example is named JUSTLIB1.DLL.
■ Compile with the /ML Option.
Compile the source file without linking. Dynamic-link libraries linked
with LLIBCDLL must be compiled with specific options.
Use the /ML option to set the library search record to LLIBCDLL.LIB
and to indicate that C run-time code is to be included in the DLL.
When you use /ML, the following options take effect:
Option Effect
────────────────────────────────────────────────────────────────────────────
/ALw Use large memory model with separate
stack
segment
/G2 Use 80286 processor instruction set
/D MT Use the multithread version of the
include files
/FPa Generate floating-point calls and select
the alternate math library
The /G2 and the /ALw options can be overridden.
You should also use the /Gs option to suppress stack checking and the
/c option to compile without linking. The complete command to compile
the sample file JUSTIFY.C is
CL /ML /Gs /c JUSTIFY.C
■ Create a module-definition file.
Create a module-definition file, JUSTLIB1.DEF, which includes the
following lines:
LIBRARY JUSTLIB1 INITINSTANCE
DATA MULTIPLE
The LIBRARY statement identifies the executable file, JUSTLIB1.DLL, as
a dynamic-link library. DLLs linked with the LLIBCDLL library must
specify INITINSTANCE in the initialization field. You could add an
EXPORTS statement for the RightJustify function in JUSTIFY.C, but it
is optional since the _export keyword was used in the source code.
See Chapter 14, "Building OS/2 Applications," for more information
about module-definition files.
■ Link with LLIBCDLL.LIB.
Ensure that the file LLIBCDLL.LIB, which takes the place of the
regular C run-time library, is available.
Create JUSTLIB1.DLL with a command such as
LINK justify,justlib1.dll,,,justlib1.def/NOI
────────────────────────────────────────────────────────────────────────────
WARNING
When you link with LLIBCDLL, you cannot have any other C run-time libraries
in the link.
────────────────────────────────────────────────────────────────────────────
■ Create an import library.
Applications that call DLLs use import libraries to identify DLL
functions to the linker. The following example uses JUSTLIB1.DLL and
the IMPLIB utility to create an import library named JUSTLIB1.LIB.
IMPLIB justlib1.lib justlib1.dll
For more information about import libraries, see Chapter 14, "Building
OS/2 Applications."
Building Programs that Call the DLL
To link a dynamic-link library with an application, you must have one of the
following:
■ A module-definition file with an IMPORTS statement for each DLL
function called by your program
■ An import library created from the DLL itself or from a
module-definition file
All calls to DLLs must be far calls; all pointers passed must be far data
pointers. If you do not compile with the large memory model option (/AL),
you must cast the DLL function calls and pointers yourself.
The sample file below, TESTJUST.C, is compiled and linked into a small-model
program named SAMPLE1.EXE. TESTJUST.C includes a function prototype that
declares RightJustify as a far function expecting a far pointer as its first
argument. Because of the prototype, the compiler will generate a far call to
RightJustify and coerce the pointer argument to the proper value.
/* TESTJUST.C. Call sample DLL library */
#include <stdio.h>
#include <string.h>
/* DLL function prototype */
int _far RightJustify( char _far *, int, char );
void main( void )
{
char buff[12];
strcpy( buff, "ABCD" );
/* Right justify to 8 characters and zero fill. */
RightJustify( buff, 8, '0' );
printf( "Result: %s\n", buff );
}
You need several files to link an application with a stand-alone DLL:
File Name Description
────────────────────────────────────────────────────────────────────────────
userdll.LIB Import library file for the DLL
userapp.DEF Optional module-definition file for your
application that contains an IMPORTS
statement for each DLL function called
(required if not using an import
library)
OS2.LIB Optional import library file for the
OS/2 kernel (required if your
application calls the kernel directly or
via a C run-time library function)
userapp.OBJ Object module(s) for your application
mLIBC f P.LIB Regular C run-time library for protected
mode, where m indicates memory model (S,
C, M, L) and
f indicates math package (A, E, 7)
The following command lines illustrate how TESTJUST.C can be compiled and
linked with the standard libraries, plus the sample dynamic-link library,
JUSTLIB1.DLL. The example uses the small memory model library and the
JUSTLIB1.LIB import library created from JUSTLIB1.DLL to create SAMPLE1.EXE.
CL /AS /G2 /c TESTJUST.C
LINK TESTJUST,SAMPLE1.EXE,,JUSTLIB1;
Make sure that the JUSTLIB1.DLL file is in a directory on your LIBPATH
before executing SAMPLE1.EXE.
16.3.2 DLLs without C Run-Time Library Functions
Building a DLL that does not call any of the C run-time library functions is
similar to creating a stand-alone DLL.
To use the JUSTIFY.C sample program shown in Section 16.3.1, "DLLs with
Static C Run-Time Library Functions," without calling C run-time functions,
one change must be made. You must remove the call to the C run-time library
function strlen. The strlen function was used in the sample program to
calculate a pointer to the end of the caller's buffer. Remove the following
line in the program JUSTIFY.C:
s = TargetBuff + strlen( TargetBuff );
Replace the line above with the following code fragment, which does the same
thing without calling strlen:
s = TargetBuff;
while ( *s )
s++;
After making this change, you can use the following commands to create a DLL
named JUSTLIB2.DLL and its import library:
CL /c /ALw /G2s /Zl JUSTIFY.C
LINK JUSTIFY,JUSTLIB2.DLL,,,JUSTLIB2.DEF/NOI
IMPLIB JUSTLIB2.LIB JUSTLIB2.DLL
Note that object modules compiled with releases of Microsoft C prior to
Version 6.0 refer to the C run-time library variable _acrtused. C 6.0
defines this variable if the main function is present. This causes the
linker to automatically add the C run-time start-up module to the DLL. To
suppress the start-up module, your source file must include a line defining
_acrtused as follows:
int _acrtused = 0;
This is required only if you do not use a C run-time library and if the link
includes object modules built with earlier versions of the compiler.
16.3.3 Programs and DLLs with a C Run-Time DLL
The CDLLOBJS.LIB and CDLLOBJS.DEF files are the foundation for building a
DLL that consists only of C run-time library functions. The application
programs and optional dynamic-link libraries linked with this DLL do not
contain any C run-time library code.
You create an application to use the C run-time DLL in either two or three
phases, depending on whether or not the application has additional DLLs:
■ Build a C run-time DLL.
■ Build any optional DLLs that use the C run-time DLL.
■ Compile and link the application.
The examples in this section use the JUSTIFY.C and TESTJUST.C source files
shown in Section 16.3.1, "DLLs with Static C Run-Time Library Functions."
Building a C Run-Time DLL
The C run-time DLL is derived from the CDLLOBJS.LIB and CDLLOBJS.DEF files
provided with the Microsoft C Professional Development System. The
CDLLOBJS.DEF file includes export definitions for all of the C run-time
library functions.
The steps for creating a C run-time DLL are given below. The C run-time DLL
in the example is named CEXAMPLE.DLL.
1. Create a module-definition file.
You can use CDLLOBJS.DEF as the basis for your own module-definition
file by copying and editing it. This allows you to create a customized
DLL that contains only the functions your application requires. If you
use the CDLLOBJS.DEF file without modification, every program that
links to your C run-time DLL will get the entire C run-time library.
The following examples create the sample file CEXAMPLE.DEF to define
the custom dynamic link library CEXAMPLE.DLL. The CEXAMPLE.DEF file,
shown below, exports the three C run-time library functions called
from JUSTIFY.C and TESTJUST.C. It also exports functions required by
the C run-time library start-up modules.
LIBRARY CEXAMPLE INITINSTANCE
DESCRIPTION 'Sample Dynamic-link C Run-Time Library'
DATA MULTIPLE
PROTMODE
EXPORTS
_printf
_strlen
_strcpy
__CRT_INIT
__aFchkstk
_exit
2. Create the C run-time DLL.
The files for creating a C run-time DLL are listed below:
File Name Description
────────────────────────────────────────────────────────────────────────────
OS2.LIB Import library for the OS/2 kernel
CDLLOBJS.LIB Dynamic link C run-time library
CRTLIB.OBJ Start-up code for C run-time DLL
yourclib.DEF Module-definition file specifying C
run-time library functions for the DLL
The command to create the sample CEXAMPLE.DLL file is
LINK /NOD /NOE /NOI crtlib.obj,cexample.dll,,cdllobjs+os2,cexample.def
3. Create an import library.
You need to create a library file of import definitions that can be
used by programs that will be linked with your custom DLL. This is a
two-step process. The first phase uses the module-definition file and
the IMPLIB utility to create an interim version of the library, as in
this example:
IMPLIB cexample.lib cexample.def
Note that the IMPLIB utility accepts either a module-definition file
or a DLL as input.
The second step uses the LIB utility to append the file CDLLSUPP.LIB
to the import library. You must append CDLLSUPP.LIB because it
contains some routines that cannot be dynamically linked. The LIB
utility requires the full path name for CDLLSUPP.LIB. If it is in a
directory named C:\ LIB, the command to complete the library build for
CEXAMPLE.LIB is
LIB CEXAMPLE.LIB+C:\LIB\CDLLSUPP.LIB;
When you have finished building the custom DLL, be sure to copy it to
a directory specified in the LIBPATH statement of the CONFIG.SYS file.
Building an Application-Specific DLL
You must compile a DLL that calls a C run-time DLL with specific options
and link it with the C run-time DLL's import library. The steps for building
an application-specific DLL named JUSTLIB3.DLL are given below.
1. Compile with the /MD option.
The easiest way to be sure you choose the proper options is to use the
/MD switch, which indicates that the DLL will be used with a C
run-time DLL. When you use /MD, library search records are suppressed
and the following options are in effect:
Option Effect
────────────────────────────────────────────────────────────────────────────
/ALw Use large memory model with separate
stack
segment
/G2 Use 80286 processor instruction set
/D MT Use the multithread version of the
include files
/D DLL Use a C run-time dynamic-link library
/FPi Generate in-line floating-point
instructions and select the emulator
math package
The /G2 and /ALw options can be overridden. The FPi option can be replaced
with /FPi87 or /FPc, but not with /FPa. See Chapter 4, "Controlling
Floating-Point Math Operations," for more information about compatible
floating-point options.
You should also use the /c option to compile without linking. The
command line to compile the sample file JUSTIFY.C is
CL /MD /c JUSTIFY.C
2. Create a module-definition file.
Create a module-definition file named JUSTLIB3.DEF that includes the
following line:
LIBRARY JUSTLIB3 INITINSTANCE
3. Link the DLL with the C run-time and OS/2 import libraries.
To create a DLL that will call a C run-time DLL, the following files
must be linked together:
File Name Description
────────────────────────────────────────────────────────────────────────────
OS2.LIB Import library for the OS/2 kernel
yourclib.LIB Import library for your C run-time DLL
CRTDLL.OBJ Start-up code for DLLs using a C
run-time DLL
CRTDLL_I.OBJ Optional initialization module for DLLs
requiring custom initialization
(replaces CRTDLL.OBJ)
yourdll.OBJ Object file for your DLL
yourdll.DEF Module-definition file for your DLL
The command for linking these files to create JUSTLIB3.DLL is
LINK justify+crtdll,justlib3.dll,,cexample+os2,justlib3.def
4. Create an import library.
Use JUSTLIB3.DLL and the IMPLIB utility to create an import library
file, JUSTLIB3.LIB, for use by applications calling JUSTLIB3.DLL:
IMPLIB JUSTLIB3.LIB JUSTLIB3.DLL
Remember to copy JUSTLIB3.DLL to a directory named in the LIBPATH
statement in the CONFIG.SYS file.
Using C Run-Time and Application-Specific DLLs
Application programs using a C run-time DLL, such as the sample program
CEXAMPLE.DLL (described earilier in this section), must define the symbolic
constants MT and DLL. These constants cause the compiler to use the
multithread and DLL sections of the include files. You can define the
constants in your source code or with the compiler's /D command-line option.
Since the C run-time DLL uses the large memory model, your program must
either use the same model or declare all C run-time functions and pointers
passed to them as _far. If you use the standard include files for the C
run-time functions in your program, all these declarations are made for you.
The following files are required to link an application that calls a C
runtime DLL:
File Name Description
────────────────────────────────────────────────────────────────────────────
OS2.LIB Import library for the OS/2 kernel
yourclib.LIB Import library for your C run-time DLL
yourdll.LIB Import library for each optional
application DLL
CRTEXE.OBJ Start-up code for executable files
calling a C run-time DLL
yourapp.OBJ Object file(s) for your application
yourapp.DEF Optional module-definition file for your
application
The following commands compile and link the TESTJUST.C file from Section
16.3.1 for use with the dynamic-link libraries CEXAMPLE.DLL and
JUSTLIB3.DLL. The link command uses the /NOD option to suppress selection of
the standard large-model library. The result is a program named SAMPLE2.EXE.
CL /AL /D MT /D DLL /G2 /c TESTJUST.C
LINK /NOD TESTJUST+CRTEXE,SAMPLE2.EXE,,CEXAMPLE+OS2+JUSTLIB3;
16.3.4 Using CodeView to Debug Dynamic-Link Libraries
The protected-mode version of CodeView (CVP) supports debugging of
dynamic-link libraries. The /L option lets you name one or more DLLs to be
debugged with your application.
To enable full symbolic debugging, use the CodeView options /Zi when
compiling and /CO when linking. Do this for both the DLL to be debugged and
for the program that calls the DLL.
The syntax for the /L CodeView option is
/L file
At least one space must separate /L from the file name(s). You can enter
multiple DLL names. To debug the JUSTLIB3.DLL dynamic-link library and the
SAMPLE2.EXE program discussed in the previous section, use this command
line:
CVP /L JUSTLIB3.DLL SAMPLE2.EXE
Use the CodeView Trace command (F8) to enter and view DLL code.
A simple way to use CodeView is to place a breakpoint at the instruction
that calls the DLL function you want to debug. When you reach the
breakpoint, press F8 to execute the current source line. CodeView will then
display the DLL function's source code, allowing you to set additional
breakpoints and enter other CodeView commands.
Appendix A Using Exit Codes
────────────────────────────────────────────────────────────────────────────
When C programs terminate, they return values to the process that started
them. These values are called "exit codes." The process that starts a C
program can be either an operating system, such as DOS or OS/2, or another
program. The process that starts the C program is referred to as the "parent
process"; the program started is referred to as the "child process." The
parent process can interpret return values as an error code sent to the
operating system or use those return values as a form of interprocess
communication (communication between two separate processes).
A.1 The exit Function
The exit function terminates execution of your C program and returns an exit
code (an integer value) to the parent process. The parent process can be the
operating system or another program, depending on how the child process was
executed. Note that a C program always returns an integer, regardless of how
you declare the main function.
Most programs use exit codes to communicate errors to the parent process;
these are called "error codes." By convention, programs return zero if they
complete normally and a nonzero value if they are exiting because of an
error. This error code (the nonzero value) can then be used by the operating
system to control the execution of other programs (for example, from inside
a batch file).
The Microsoft C compiler is a good example of a program that returns an exit
code. It returns 0 if no errors occur in your compile and a positive value
if an error occurs during compilation.
The following program attempts to open a file for reading. If the file
cannot be opened, exit returns 1 to the calling program. Therefore, 1 and 0
are both exit codes.
#include <stdio.h>
int main(void)
{
FILE * fp;
if( !(fp = fopen( filename, "rb" )) )
{
printf("Error %d: Could not open file\n", errno);
exit(1);
}
do_file_access(fp);
}
In the preceding example, the exit code is unpredictable because the exit
function is not used. The value actually returned to the parent process (or
to the operating system shell) is whatever happens to be in the AX register
when the program terminates─in this case, whatever do_file_access
returned.
A.2 Testing Exit Codes from Command and Batch Files
Using the IF ERRORLEVEL command, you can test to see if a program has
executed successfully by checking its exit code. The IF ERRORLEVEL command
is an OS/2 command file or DOS batch file command that tests the exit code
of the most recently executed program.
IF ERRORLEVEL can help you organize program execution. For example, you can
define program execution to be dependent on the successful exit code testing
of earlier programs by IF ERRORLEVEL. You can also use the value of the exit
code to branch to different commands in a batch or command file.
When placed in a batch or command file, the following commands will execute
REPORTS.EXE only if FILEMNG.EXE does not return an error:
echo Running file manager....
FILEMNG.EXE
IF NOT ERRORLEVEL 1 REPORTS.EXE
Despite the name ERRORLEVEL, the exit code does not always denote an error.
You can define error codes to communicate any information useful to you.
Refer to the Microsoft Operating System/2 User's Guide or the Microsoft
MS-DOS User's Guide and User's Reference for more information about the IF
ERRORLEVEL command.
A.3 Accessing Exit Codes from Other Programs
When you use any of the spawn family of functions to run a program as the
child of another program, the return value of spawn is the exit code of the
function. The following code performs the same function as the batch file in
Section A.2:
void main( void )
{
if( !spawnl( P_WAIT, "filemng.exe", "filemng.exe",
NULL ) )
spawnl( P_WAIT, "reports.exe", "reports.exe",
NULL );
}
The program reports.exe is executed only if the program filemng.exe
terminates with an exit code of 0.
The following code uses the exit code as part of a simple menu system:
void main(void)
{
int option;
int menu_num = 0; /* Initialize for first execution */
while( (option = spawnl( P_WAIT, "menu.exe",
"menu.exe", menu_num, NULL )) )
{
switch( option )
{
case 1 :
menu_num = spawnl( P_WAIT, "program1.exe",
"program1.exe", NULL );
break;
case 2 :
menu_num = spawnl( P_WAIT, "program2.exe",
"program2.exe", NULL );
break;
case 3 :
menu_num = spawnl( P_WAIT, "program3.exe",
"program3.exe", NULL );
break;
default: /* Guard against a bad option */
break;
}
}
}
The preceding example demonstrates how you could have a program, menu.exe,
that solicits input from a menu of choices. This input is interpreted and
passed back to the main program in the form of an exit code. (The spawnl
function returns the value of the child process's exit code.) This exit code
value is stored in option, which is used as a selector variable in a switch
statement.
Based on the value returned from menu.exe, the main program executes
program1.exe, program2.exe, or program3.exe. Finally, menu_num, the exit
code of the program selected, is used as a parameter to the next execution
of menu.exe.
Appendix B Differences between C Versions 5.1 and 6.0
────────────────────────────────────────────────────────────────────────────
This appendix describes the differences between versions 5.1 and 6.0 of
Microsoft C, including additions, deletions, and changes. Some of the
changes are required by the American National Standards Institute (ANSI)
draft standard for the C programming language. Other changes improve or
augment the existing capabilities of the compiler.
Many of the changes will have no effect on code that was written and
compiled with previous versions of Microsoft C. In some cases, however, you
may have to modify or correct existing code before compiling with version
6.0.
B.1 Modifications for ANSI Compatibility
A number of changes have been made to the compiler to support the ANSI draft
standard. These include new features (Section B.1.1) and changes (Sections
B.1.2 - B.1.8).
B.1.1 ANSI-Mandated New Features
The following ANSI-mandated features are new to version 6.0:
■ The semantics for volatile have been implemented.
■ Both long and unsigned long values are allowed in switch expressions
and case constants.
■ The compiler supports unsigned long decimal constants. It is now
possible to initialize unsigned long variables with values larger than
MAX_LONG using decimal (rather than hexadecimal or octal) constants.
■ Bit fields are permitted in unions.
■ The address-of operator (&) works correctly on arrays and functions.
■ Storage classes or types (or both) are now required on variable
declarations. The compiler previously assumed that untyped variables
(such as a;) were integers. This declaration now generates a warning.
■ The LOCALE.H header file is new to version 6.0. It declares functions
and structures for describing conventions that vary from one country
to the next, such as the currency symbol and the way calendar dates
are printed.
B.1.2 Integer Promotion Rules
The ANSI draft standard requires a change in the evaluation of some
expressions that mix signed and unsigned integers. Earlier versions of the
compiler attempted to preserve an expression's unsigned nature as much as
possible. Version 6.0 attempts to preserve the expression's value.
In version 5.1, an unsigned char promotes to an unsigned int; an unsigned
int promotes to an unsigned long.
In version 6.0, an unsigned char promotes to a signed int; an unsigned int
promotes to a signed long.
For example,
main()
{
long int li = -256L;
test( li );
}
test( long li)
{
if( li < 0xffff )
puts( "C 6.0 does a signed compare" );
else puts( "C 5.1 does an unsigned compare" );
}
B.1.3 Defining NULL as a Pointer
The constant NULL is now defined as ((void *)0). Previous versions of
Microsoft C defined NULL as 0x0000 in small and medium models and
0x00000000L in compact and large models.
B.1.4 Shift Operators
Shift operators now give a result that is of the same type as the left side.
For example,
short si;
long li;
si = 0x0001;
li = si << 16L;
The compiler previously yielded a result that was the size of the largest of
the two values. In the example above, the short value would be automatically
cast to a long because 16L is long. The value assigned to li would be
0x00010000L in Microsoft C 5.1.
To adhere to the ANSI draft standard, Microsoft C 6.0 maintains the size of
the left operand. The variable si has 16 bits. Shifting left 16 times
produces a value of 0, which is then assigned to li.
B.1.5 Pointers to Typedefs
The rules for handling pointers to typedefs have changed subtly. For
example, C 5.1 interprets
typedef int far f_int;
f_int *fp_i;
as being equivalent to
int *far fp_i;
which means fp_i is a distant pointer to an integer. The address of fp_i
contains 32 bits. The size of the integer's address is indeterminate.
C 6.0 interprets it as
int far *fp_i;
This means fp_i is a far pointer to an integer. The address of the integer
contains 32 bits. The size of the address of fp_i is indeterminate.
This affects typedefs containing _near, _far, _based, and other modifiers.
Although these are Microsoft-specific keywords, their new behavior is
consistent with what the ANSI draft standard requires for the const and
volatile keywords.
B.1.6 Identifying Nonstandard Keywords
The following modifiers are specific to Microsoft C; they are not described
in the ANSI draft standard. To identify these implementation-defined
keywords as non-ANSI, an initial underscore has been added.
C 5.1 Keyword C 6.0 Keyword
────────────────────────────────────────────────────────────────────────────
far _far
huge _huge
near _near
cdecl _cdecl
fortran _fortran
interrupt _interrupt
pascal _pascal
The compiler still accepts the obsolescent versions of these keywords,
unless the /Za option is used.
B.1.7 Trigraphs
To maintain compatibility with and portability to other systems, Microsoft C
6.0 supports the following trigraphs:
Trigraph Character
────────────────────────────────────────────────────────────────────────────
??= #
??( [
??/ \
??) ]
??' ^
??< {
??! |
??> }
??- ~
B.1.8 ANSI Nonconformance
This section lists the areas where Microsoft C 6.0 does not conform to the
ANSI draft standard.
■ Microsoft C does not support multibyte characters, wide-character and
string constants, and the related library functions and types.
■ Microsoft C contains some name-space violations in the language
(extended keywords, such as near and far) and in the library (non-ANSI
macros and types in header files and extended library function names,
such as read and write).
B.2 New Keywords and Functions
This section describes keywords and functions that did not exist in previous
versions of Microsoft C. Details about how to use these features can be
found elsewhere in the documentation.
B.2.1 In-Line Assembler
The new _asm keyword allows you to mix assembly instructions with C source
code. This feature includes the _emit function, which lets you enter
arbitrary values into the code stream.
See Chapter 3, "Using the In-Line Assembler."
B.2.2 Based Pointers and Objects
A based pointer is a special, compact form of pointer. It is always
represented as a short offset. The address represented by such a pointer is
calculated by adding the based pointer to its base. The base must be
supplied each time the pointer is dereferenced, either explicitly using a
special operator or implicitly by associating the base value with the
pointer when it is declared. The base can be a far pointer, a near pointer,
or a new type that represents a segment.
Based pointers and objects are declared using the new keyword, _based.
Segment Types
The new type specifier, _segment, specifies a segment.
Any pointer or address can be cast to _segment. If the operand is a near
pointer, the result is the current value of the data segment register (DS).
If the operand is a far pointer, the result is the segment part of the far
pointer.
Segment Names
Segment names are declared using the built-in function _segname. The
compiler recognizes four predefined segment names: _CODE, _CONST, _DATA, and
_STACK.
Each segment name represents a constant of type _segment.
Base Operator
The base operator (:>) associates a base expression (usually a segment) with
a based pointer, to form a far pointer value. For example,
0x0F01:>0x0015
combines the segment 0x0F01 with the offset 0x0015 to form the effective
address 0x0F025. The base operator's precedence falls between ( ) and [ ].
Casting Based Pointers
A based pointer can be cast to a pointer, a long integer, a short integer,
or another based pointer. When a based pointer is converted to a far
pointer, a long integer, a near pointer, or another based pointer having a
different base expression, it is first normalized to a far pointer
(including adding the offset in the base, if present, to the based pointer);
then any additional conversions are applied.
Operations on Based Pointers
Based pointers, for the purpose of arithmetic and dereferencing, are treated
as semantically equivalent to far pointers. When a based pointer mixes with
another integral type (int, long, near pointer, far pointer, or based
pointer), implicit casting is done. In some cases, the compiler can optimize
these references and treat the pointer as an offset.
The value of 0 is treated specially, as it is for near and far pointers. No
conversions are applied to the constant 0 because it is assumed to be a null
pointer.
See Chapter 2, "Managing Memory."
B.2.3 Based Heap Allocation Support
The functions listed below provide support for allocating, expanding, and
freeing memory for based heaps, which dynamically allocate memory for based
items. The functions are prototyped in the MALLOC.H include file.
╓┌──────────┌───────────┌────────────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
_bcalloc _bheapchk _bmalloc
_bexpand _bheapmin _bmsize
_bfree _bheapseg _brealloc
_bfreeseg _bheapset
_bheapadd _bheapwalk
See Chapter 2, "Managing Memory."
B.2.4 Releasing Unused Heap Memory
The following routines release unused heap memory by shortening data
segments. MALLOC.H contains the function prototypes.
────────────────────────────────────────────────────────────────────────────
_fheapmin
_heapmin
_nheapmin
B.2.5 Making Static Data Available to the Heap
The _heapadd function is new. It allows the user to make unused static data
available to the heap.
B.2.6 Long Doubles
Microsoft C version 5.1 treated double and long double as syntactically
different types that were semantically equal. Both types were stored in
memory as 64-bit quantities. For purposes of type-checking, long double and
double have always been different types.
Because the 80x87 family of math coprocessors supports an 80-bit
floating-point type, Microsoft C version 6.0 stores long double variables in
the 80x87 10-byte (80-bit) form.
Certain functions have been modified to handle the long double type. The
printf and scanf family of functions supports long double values with the
trailing l. The library contains new versions of the transcendental
functions as well as intrinsic forms that accept long double arguments.
B.2.7 Long Double Functions
All the functions below are defined in the standard include file MATH.H.
They return long double values and results and error codes analogous to the
double versions.
╓┌───────┌───────┌───────────────────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
acosl expl _matherrl
asinl fabsl modfl
atanl floorl powl
atan2l fmodl sinl
_atold frexpl sinhl
cabsl hypotl sqrtl
ceill ldexpl tanl
cosl logl tanhl
coshl log10l
────────────────────────────────────────────────────────────────────────────
coshl log10l
B.2.8 Model-Independent String and Memory Functions
The following functions make it easier to write mixed-model programs by
providing model-independent (large model) forms for most of the standard
string and memory functions. These functions can be called from any point in
any program, no matter which memory model has been selected. These functions
take only far pointers as arguments. Thus, any data item, near or far, in
any combination, can be handled.
The names of these functions are the same as the model-dependent forms,
except they include an _f prefix. For example, _fstrlen is the
model-independent version of the strlen function.
The functions listed below are defined in the standard include file
STRING.H.
Memory Functions
╓┌────────────────────────────┌──────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
_fmemccpy _fmemcpy
_fmemchr _fmemmove
_fmemcmp _fmemset
_fmemicmp
String Functions
╓┌──────────┌───────────┌────────────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
_fstrcat _fstrlwr _fstrrchr
_fstrchr _fstrncat _fstrrev
_fstrcmp _fstrncmp _fstrset
_fstricmp _fstrnicmp _fstrspn
_fstrcpy _fstrncpy _fstrstr
────────────────────────────────────────────────────────────────────────────
_fstrcpy _fstrncpy _fstrstr
_fstrcspn _fstrnset _fstrtok
_fstrlen _fstrpbrk _fstrupr
String Duplication Functions
────────────────────────────────────────────────────────────────────────────
_fstrdup
_nstrdup
B.2.9 Mixed-Model Memory Allocation Support
The following functions are based on realloc, calloc, and expand, but they
affect only near memory or far memory. MALLOC.H contains the function
prototypes.
╓┌───────────────────────────┌───────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
_fcalloc _ncalloc
_fexpand _nexpand
_frealloc _nrealloc
B.2.10 The _fastcall Attribute (/Gr Option)
Individual function prototypes can be declared with the new attribute
_fastcall.
The /Gr option enables the fastcall function-calling convention for all
functions that are not explicitly prototyped with the _cdecl, _pascal, or
_fortran attributes. Using /Gr on the command line causes each function in
the module to compile as _fastcall unless the function is declared with a
conflicting attribute, or the name of the function is main.
When you use the /Gr option, all functions are assumed to use the _fastcall
convention. As a result, to use any run-time library functions, you must
either include the standard include files or explicitly prototype the
function you want to call.
A fastcall function receives up to three 16-bit arguments, passed in
registers rather than on the stack. Arguments are passed in the AX, BX, and
DX registers. This may change in future versions of the compiler.
The argument types and their potential register assignments are
Argument Registers
────────────────────────────────────────────────────────────────────────────
character (3) AL, DL, BL
short integer (3) AX, DX, BX
near pointer (3) BX, AX, DX
long integer (1) DX:AX
far pointer (1) ES:BX
If the registers for a particular class have already been used, or if an
argument is not one of the five types listed above, it is pushed on the
stack as usual. An argument list of types long, float, short would pass the
long in DX:AX, push the float, and pass the short in BX.
The treatment of character arguments depends further on prototypes. If there
is no prototype, the argument is promoted to short and the rules for short
integers apply. Only if the argument is prototyped as a char do the
character rules apply.
The _fastcall convention is not compatible with any of the following
attributes: _interrupt, _saveregs, _export, _cdecl, _fortran, or _pascal.
See Chapter 1, "Optimizing C Programs."
B.2.11 Drive and Directory Functions
Several new functions make it easier to get and set the current drive and
the current directory. The prototypes for the following routines are in
DIRECT.H:
────────────────────────────────────────────────────────────────────────────
_chdrive
_fullpath
_getdrive
_getdcwd
B.2.12 Text Output Functions for OS/2
Several text-mode screen functions have been added to Microsoft C 6.0 for
OS/2. With the exception of the new _scrolltextwindow function, they are
identical to what is defined in real mode, except for any references to
behavior in graphics modes. The following routines are located in
GRTEXT.LIB, and the prototypes are in GRAPH.H:
╓┌─────────────────┌─────────────────┌───────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
_clearscreen _getvideoconfig _settextrows
_displaycursor _outtext _settextwindow
_getbkcolor _setbkcolor _setvideomode
_gettextcolor _settextcolor _setvideomoderows
────────────────────────────────────────────────────────────────────────────
_gettextcolor _settextcolor _setvideomoderows
_gettextcursor _settextcursor _scrolltextwindow
_gettextposition _settextposition _wrapon
See Part 4 of this manual, "OS/2 Support."
B.3 New Features
The features described in Sections B.3.1-B.3.10 are new to version 6.0.
B.3.1 Strings and Macros
The compiler now allows longer string literals (up to 4K) and longer macro
expansions (up to 6K).
B.3.2 CL Options
The following options are new to Microsoft C 6.0:
Option Action
────────────────────────────────────────────────────────────────────────────
/AT Compiles in tiny model (.COM files).
/Fr« filename» Outputs source browser information file.
/FR« filename» Outputs extended source browser
information file.
/Gd Forces _cdecl calling conventions.
/Gr Enables register (_fastcall)
function-calling
conventions.
/MAmasmoption Supports invocation of the assembler
using the CL driver. All MASM-supported
options are accepted. In addition, the
compiler recognizes file names with .ASM
suffixes and passes them directly to
MASM.
/MD- Uses C run-time as DLL option. Defaults
to
/ALw /FPi /G2 /DDLL /DMT and inhibits
library search records.
/ML Links C run-time as part of a
dynamic-link library (DLL). Defaults to
/ALw /FPa /G2 /DMT and changes library
search record to LLIBCDLL.LIB.
/MT Enables multithread option. Defaults to
/ALw /FPi /G2 /DMT and changes library
search record to LLIBCMT.LIB.
/Oe Enables global register allocation.
/Og Enables global optimizations and global
common subexpressions (CSEs).
/Ox Is now equivalent to /Ocegilt /Gs. Note
that this implies that maximum
optimization includes the _fastcall
function-calling convention.
/Oz Enables aggressive optimizations.
/Ta name Specifies that name is to be treated as
an assembler input file.
/W4 Turns on extra warning level which
supports more detailed (LINT-like)
warnings and recognition of ANSI
violations.
/WX Causes warnings to be treated as errors.
If a warning occurs, the .OBJ file is
not created.
B.3.3 Tiny Memory Model (.COM Files)
Microsoft C 6.0 now supports the tiny memory model, which produces .COM
rather than .EXE files (for DOS only).
The /AT option selects the tiny model. This forces the linker to use options
/NOE and /TINY. Within the linker, /TINY turns on /FARCALLTRANSLATION to
help eliminate far segment relocations. If you link your own .OBJ files,
link with CRTCOM.OBJ.
B.3.4 The Optimize Pragma
The optimize pragma turns optimizing options on or off:
#pragma optimize("<optimization switch list>",{off|on})
where <optimization switch list> can be an empty list or one or more of the
following: a, c, e, g, l, w, n, p, t, and z. For example,
#pragma optimize("lp",on) /* equivalent to /Olp */
#pragma optimize("",off) /* turns off all optimization */
#pragma optimize("",on) /* restores default settings */
See Chapter 1, "Optimizing C Programs."
B.3.5 Nameless Structures and Unions
Both struct and union declarations can now be specified without a declarator
when they are members of another structure or union.
A nameless union would look like this:
struct str
{
int a,b;
union /* unnamed union */
{
char c[4];
long l;
float f;
};
char c_array[10];
} my_str;
.
.
.
my_str.l == 0L;
A nameless structure would look like this:
struct s1
{
int a,b,c;
};
struct s2
{
float y;
struct s1;
char str[10];
} *p_s2;
.
.
.
p_s2->b = 100;
B.3.6 Unsized Arrays as the Last Member of a Structure
The compiler now allows an unsized or zero-sized array as the last member of
a structure. The declaration of such a structure would look like this:
struct var_length
{
<set of declarations>;
<type> array[];
};
Unsized arrays can appear only as the last member of a structure. Structures
containing unsized array declarations can be nested within other structures
as long as no further members are declared in any enclosing structures.
Arrays of such structures are not allowed.
The sizeof operator, when applied to a variable of this type or to the type
itself, assumes 0 for the size of the array.
B.3.7 Improved Warnings
A new warning level four (CL option /W4) has been added for the following
warnings:
■ Detection of unused global variables
■ Expressions without side effects
■ Nonportable (non-ANSI) constructs
■ Local variable referenced before being initialized
■ Undefined or implementation-defined constructs
B.3.8 Macros
The number of macros definable with /D options has increased from 20 to 30.
B.3.9 Improved Multithread Support in OS/2
The number of OS/2 threads supported at run time has increased from 32 to
the operating system limit. Three new options aid development of multithread
applications and dynamic-link libraries:
1. /MT for building multithread programs. It implies /ALw /FPi /G2 /D MT,
and changes the library search record emitted in the object file to
reference LLIBCMT.
2. /ML for building a DLL that uses the C run-time library. It implies
/ALw /FPa /G2 /D MT, and changes the library search record emitted in
the object file to reference LLIBCDLL.
3. /MD for building .EXE files and DLLs that share a C run-time DLL. It
implies /ALw /FPi /G2 /DDLL /D MT, and no library search records are
emitted in the object file.
B.3.10 Pipe Support in OS/2
Microsoft C 6.0 supports pipes as part of the file I/O system. The functions
listed below are defined in the standard include file IO.H:
────────────────────────────────────────────────────────────────────────────
_pipe
_popen
_pclose
B.4 Differences in Code Generation
This section lists ways in which the executable files produced by Microsoft
C 6.0 may differ from the files produced by previous versions of the
compiler.
B.4.1 Speed and Space Improvements
Executable files are smaller and faster.
B.4.2 Code Quality
Microsoft C 6.0 generates improved local code in default optimization cases
and, under full optimization, supports global (function level) register
allocation and common subexpressions (CSEs), loop optimizations, parameter
passing through registers, and generation of in-line code for certain
intrinsic functions.
B.4.3 Floating-Point Code Generation
In Microsoft C 6.0, the /FPi87 option suppresses the fixups previously used
for emulation. Pure coprocessor instructions are now emitted. This makes
object files smaller and speeds up linking, in addition to making in-line
assembly easier to use.
In version 5.1, /FPi and /FPi87 generated the same code; the only difference
was the library. In C 6.0, the two options generate different code. It is no
longer possible to force /FPi87 to act like /FPi. If you use /FPi87, the
math coprocessor must be in the computer on which the program is running.
Note that if you use /FPi87 you must link with mLIB7, not mLIBCE.
B.4.4 Intrinsic Functions
The intrinsic function optimization option (/Oi) causes the compiler to
generate in-line code for the following functions:
╓┌─────────┌───────┌─────────────────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
abs _lrotl _rotl
_disable _lrotr _rotr
_enable memcmp strcat
ffabs memcpy strcmp
inp memset strcpy
inpw outp strlen
labs outpw strset
The compiler does not generate in-line code for the following functions,
although it will modify the calling convention to pass the arguments on the
floating-point chip:
╓┌──────┌───────┌────────────────────────────────────────────────────────────╖
────────────────────────────────────────────────────────────────────────────
acos pow coshl
asin sin expl
atan sinh floorl
atan2 sqrt fmodl
ceil tan logl
────────────────────────────────────────────────────────────────────────────
ceil tan logl
cos tanh log10l
cosh acosl powl
exp asinl sinl
floor atanl sinhl
fmod atan2l sqrtl
log ceill tanl
log10 cosl tanhl
B.5 Changes and Deletions
The changes and deletions listed in this section have a high probability of
affecting existing programs.
B.5.1 Deleted Features
The data_seg pragma has been deleted.
The memory management routine sbrk has been deleted.
The compiler and tools do not run under DOS 2.1. The run-time files produced
by the compiler and linker will continue to run under DOS 2.1.
B.5.2 Evaluation of Real Expressions
Real expressions inside parentheses are now evaluated according to the
semantics of the parentheses. For example, in the expression
((r1 / r2) * r3)
the division is performed before the multiplication. Previous versions of
the compiler might have reordered the operations.
B.5.3 Default Optimizations
Version 6.0 performs more extensive optimizations than version 5.1. This
implies that code that had aliasing but worked with the /Oa option in 5.1
might not work with version 6.0 and /Oa. Also, because of the improved
optimizations, the /Od option should be used to turn off all optimizing
before you begin debugging with CodeView.
B.5.4 Sign Extension of char Arguments
Previous versions of Microsoft C would sign-extend char arguments to int
size before passing them to a second function. Version 6.0 does not extend
the sign if the function is prototyped and the prototype includes a char
argument. The most-significant byte is considered undefined.
B.5.5 Conditional Compilation and Signed Values
Version 5.1 of Microsoft C treated conditional compilation expressions as
signed long values. Version 6.0 evaluates these expressions using the same
rules as expressions in C. For example,
#if 0xFFFFFFFFL > 1UL
.
.
.
#endif
The expression evaluates to be true. It was evaluated as false in version
5.1.
B.5.6 The const and volatile Qualifiers
The const and volatile qualifiers must be placed after the type they
qualify. The declaration
int (const *p);
is now treated as a syntax error. Previous versions of the compiler would
accept such a construction.
The following declarations are legal:
int const *p_ci; /* pointer to constant int */
int const (*p_ci); /* pointer to constant int */
int *const cp_i; /* constant pointer to int */
int (*const cp_i); /* constant pointer to int */
B.5.7 Memory Allocation
The _fmalloc function attempts to allocate far memory. It previously called
_nmalloc if far memory was not available. Now it returns a null pointer if
far memory isn't available, even if near memory is available.
B.5.8 Memory Used by Command-Line Arguments
Previous versions of the compiler placed the command-line argument strings
and environment strings in the near heap. Now they are allocated though
malloc, which means that they will be in far memory in compact and large
models.
B.5.9 Format Specifiers in printf
The printf format specifier modifiers N, F, h, and l have changed.
The specifier %Np is a synonym for %hp, but the latter is preferred.
Likewise, %Fp is a synonym for %lp.
For scanf, N and F refer to the distance to the object being read in; that
is, whether the pointer itself is allocated near or far. The modifiers h and
l refer to the size of the object (16-bit near pointer or 32-bit far
pointer). In these examples,
scanf("%Nlp", n_fp);
scanf("%Fhp", f_np);
the first line reads in an address that resides in near memory (N) but holds
a 32-bit far pointer variable (lp). The second line reads in a near pointer
value (hp) into a pointer variable that resides in far memory (F).
B.5.10 Functions that Return Float Values
In Microsoft C 5.1, a prototype or definition such as
float funcname();
was interpreted as
double funcname()
Version 6.0 interprets it as
float
Appendix C Implementation-Defined Behavior
────────────────────────────────────────────────────────────────────────────
The American National Standards Institute (ANSI) Standard for the C
programming language contains an appendix called "Portability Issues." The
ANSI appendix lists areas of the C language that ANSI leaves open to each
particular implementation. This appendix describes how Microsoft C handles
these implementation-defined areas of the C language.
This appendix follows the same order as the ANSI Standard appendix. Each
item covered includes references to the ANSI chapter and section that
explains the implementation-defined behavior.
────────────────────────────────────────────────────────────────────────────
NOTE
This appendix describes the U.S. English-language version of the C compiler
only. Foreign-language implementations of Microsoft C may differ slightly.
────────────────────────────────────────────────────────────────────────────
C.1 Translation
C.1.1 Diagnostics
How a diagnostic is identified (§2.1.1.3)
Microsoft C produces error messages in the form:
filename(line-number) : diagnostic Cnumber message
where filename is the name of the source file in which the error was
encountered; line-number is the line number at which the compiler detected
the error; diagnostic is either "error" or "warning"; number is a unique
four-digit number (preceded by a C) that identifies the error or warning;
message is an explanatory message.
C.2 Environment
C.2.1 Arguments to main
The semantics of the arguments to main (§2.1.2.2)
In Microsoft C, the function called at program start-up is called main.
There is no prototype declared for main, and it can be defined with zero,
two, or three parameters:
int main( void )
int main( int argc, char *argv[] )
int main( int argc, char *argv[], char *envp[] )
The third line above, where main accepts three parameters, is a Microsoft
extension to the ANSI Standard. The third parameter, envp, is an array of
pointers to environment variables. The envp array is terminated by a null
pointer. See on-line help for more information about main and envp.
The variable argc never holds a negative value.
The array of strings ends with argv[argc], which contains a null pointer.
All elements of the argv array are pointers to strings.
A program invoked with no command-line arguments will receive a value of one
for argc, as the name of the executable file is placed in argv[0]. (In DOS
versions prior to 3.0, the executable file name is not available. The letter
"C" is placed in argv[0].) Strings pointed to by argv[1] through argv[argc -
1] represent program parameters.
The parameters argc and argv are modifiable and retain their last-stored
values between program start-up and program termination.
C.2.2 Interactive Devices
What constitutes an interactive device (§2.1.2.3)
Microsoft C defines the keyboard and the display as interactive devices.
C.3 Identifiers
C.3.1 Significant Characters without External Linkage
The number of significant characters without external linkage (§3.1.2)
Identifiers are significant to 31 characters. The compiler does not restrict
the number of characters you can use in an identifier; it simply ignores any
characters beyond the limit.
C.3.2 Significant Characters with External Linkage
The number of significant characters with external linkage (§3.1.2)
Identifiers declared extern in programs compiled with Microsoft C are
significant to 31 characters. You can modify this default to a smaller
number using the /H (restrict length of external names) option. See on-line
help for more information on the syntax of the /H option.
C.3.3 Upper- and Lowercase
Whether case distinctions are significant (§3.1.2)
Microsoft C treats identifiers within a compilation unit as case sensitive.
Externally linked identifiers may or may not be case sensitive, depending on
whether you use /NOIGNORECASE option when you invoke the linker. The default
for the linker is to ignore case, making externally linked identifiers case
insensitive.
Thus, symbols in source files are sensitive to case. By default, symbols in
object files are not.
Two CL command-line options affect case sensitivity:
1. The /Gc (generate Pascal-style function calls) command-line option
converts all external identifiers (including function names) to
uppercase.
The _pascal declarator performs the same operation on a
function-byfunction basis.
2. The /Zc (compile case insensitive) converts all identifiers (excluding
function names) to uppercase.
C.4 Characters
C.4.1 The ASCII Character Set
Members of source and execution character sets (§2.2.1)
The source character set is the set of legal characters that can appear in
source files. For Microsoft C, the source character set is the standard
ASCII character set. Figure C.1 contains an ASCII table.
────────────────────────────────────────────────────────────────────────────
WARNING
Because keyboard and console drivers can remap the character set, programs
intended for international distribution should check the country code.
────────────────────────────────────────────────────────────────────────────
C.4.2 Multibyte Characters
Shift states for multibyte characters (§2.2.1)
Multibyte characters are used by some implementations to represent
foreignlanguage characters not represented in the base character set.
Microsoft C 6.0 does not support multibyte characters.
C.4.3 Bits per Character
Number of bits in a character (§2.2.4.2)
The number of bits in a character is represented by the manifest constant
CHAR_BIT. The LIMITS.H file defines CHAR_BIT as 8.
C.4.4 Character Sets
Mapping members of the source character set (§3.1.3.4)
The source character set and execution character set include the ANSI ASCII
characters listed in Table C.1. Escape sequences are also shown in Table
C.1.
Table C.1
╓┌────────────────┌─────────────────┌────────────────────────────────────────╖
Escape Sequence Character ASCII Value
────────────────────────────────────────────────────────────────────────────
\a Alert/bell 7
\b Backspace 8
\f Form feed 12
\n Newline 10
\r Carriage return 13
\t Horizontal tab 9
\v Vertical tab 11
\" Double quotation 34
\' Single quotation 39
\\ Backslash 92
────────────────────────────────────────────────────────────────────────────
Escape Sequence Character ASCII Value
────────────────────────────────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────
C.4.5 Unrepresented Character Constants
The value of an integer character constant that contains a character or
escape sequence not represented in the basic execution character set or the
extended character set for a wide character constant (§3.1.3.4)
Microsoft C does not support wide characters.
C.4.6 Wide Characters
The value of an integer character constant that contains more than one
character or a wide character constant that contains more than one multibyte
character (§3.1.3.4)
Microsoft C does not support wide characters or multibyte characters.
C.4.7 Converting Multibyte Characters
The current locale used to convert multibyte characters into corresponding
wide characters (codes) for a wide character constant (3.1.3.4)
Microsoft C does not support multibyte characters.
C.4.8 Range of char Values
Whether a "plain" char has the same range of values as a signed char or an
unsigned char (§3.2.1.1)
All character values range from 0x00 to 0xFF, signed or unsigned. If a char
is not explicitly marked as signed or unsigned, it defaults to the signed
type.
The CL option /J changes the default from signed to unsigned.
C.5 Integers
C.5.1 Range of Integer Values
The representations and sets of values of the various types of integers
(§3.1.2.5)
Short integers contain 16 bits (two bytes). Long integers contain 32 bits
(four bytes). Signed integers are represented in two's-complement form. The
mostsignificant bit holds the sign: 1 for negative, 0 for positive and zero.
The values are listed below:
Type Minimum and Maximum
────────────────────────────────────────────────────────────────────────────
unsigned short 0 to 65535
signed short -32768 to 32767
unsigned long 0 to 4294967295
signed long -2147483648 to 2147483647
C.5.2 Demotion of Integers
The result of converting an integer to a shorter signed integer, or the
result of converting an unsigned integer to a signed integer of equal
length, if the value cannot be represented (§3.2.1.2)
When a long integer is cast to a short, or a short is cast to a char, the
least significant bytes are retained.
For example, this line
short x = (short)0x12345678L;
assigns the value 0x5678 to x, and this line
char y = (char)0x1234;
assigns the value 0x34 to y.
When signed variables are converted to unsigned and vice versa, the bit
patterns remain the same. For example, casting -2 (0xFE) to an unsigned
value yields 254 (also 0xFE).
C.5.3 Signed Bitwise Operations
The results of bitwise operations on signed integers (§3.3)
Bitwise operations on signed integers work the same as bitwise operations on
unsigned integers. For example, -16 & 99 can be expressed in binary as
11111111 11110000
& 00000000 01100011
-----------------
00000000 01100000
The result of the bitwise AND is 96.
C.5.4 Remainders
The sign of the remainder on integer division (§3.3.5)
The sign of the remainder is the same as the sign of the dividend. For
example,
50 / -6 == -8
50 % -6 == 2
-50 / 6 == -8
-50 % 6 == -2
C.5.5 Right Shifts
The result of a right shift of a negative-value signed integral type
(§3.3.7)
Shifting a negative value to the right yields half the absolute value,
rounded down. For example, -253 (binary 11111111 00000011) shifted right one
bit produces -127 (binary 11111111 10000001). A positive 253 shifts right to
produce +126.
Right shifts preserve the sign bit. When a signed integer shifts right, the
mostsignificant bit remains set. When an unsigned integer shifts right, the
mostsignificant bit is cleared. Thus, if 0xF000 is signed, a right shift
produces 0xF800. If 0xF000 is unsigned, the result is 0x7800.
Shifting a positive number right sixteen times produces 0x0000. Shifting a
negative number right sixteen times produces 0xFFFF.
C.6 Floating-Point Math
C.6.1 Values
The representations and sets of values of the various types of
floating-point numbers (§3.1.2.5)
The float type contains 32 bits: 1 for the sign, 8 for the exponent, and 23
for the mantissa. Its range is +/- 3.4E38 with at least 7 digits of
precision.
The double type contains 64 bits: 1 for the sign, 11 for the exponent, and
52 for the mantissa. Its range is +/- 1.7E308 with at least 15 digits of
precision.
The long double type is new to Version 6.0 of Microsoft C. It contains 80
bits: 1 for the sign, 15 for the exponent, and 64 for the mantissa. Its
range is +/- 1.2E4932 with at least 17 digits of precision.
C.6.2 Casting Integers to Floating-Point Values
The direction of truncation when an integral number is converted to a
floating-point number that cannot exactly represent the original value
(§3.2.1.3)
When an integral number is cast to a floating-point value that cannot
exactly represent the value, the value is rounded (up or down) to the
nearest suitable value.
For example, casting an unsigned long (with 32 bits of precision) to a float
(whose mantissa has 23 bits of precision) rounds the number to the nearest
multiple of 256. The long values 4294966913 - 4294967167 are all rounded to
the float value 4294967040.
C.6.3 Truncation of Floating-Point Values
The direction of truncation or rounding when a floating-point number is
converted to a narrower floating-point number (§3.2.1.4)
When an underflow occurs, the value of a floating-point variable is rounded
down to zero. An overflow causes a run-time math error.
C.7 Arrays and Pointers
C.7.1 Largest Array Size
The type of integer required to hold the maximum size of an array─that is,
the size of size_t (§3.3.3.4, 4.1.1)
The size_t typedef is an unsigned short, with the range 0x0000 to 0xFFFF.
Huge arrays can exceed this limit if they contain more than 65,535 elements.
Arithmetic operations on huge arrays should therefore cast size_t and the
results of an arithmetic operations on pointers to unsigned long.
C.7.2 Casting Pointers
The result of casting a pointer to an integer or vice versa (§3.3.4)
Near pointers are the same size as short integers; casting near to short (or
short to near) has no immediate effect on the value.
Far pointers and huge pointers are the same size as long integers. Casting
far/huge to long (or long to far/huge) has no immediate effect on the value.
When a near pointer is cast to a long, the 16-bit value is "normalized,"
which means the segment (usually DS) and offset are combined to produce a
32-bit memory location.
When a far or huge pointer is cast to a short, the long value is truncated
to a short.
The compiler normalizes based pointers when necessary, unless the based
pointer is a constant zero, in which case it is assumed to be a null
pointer. See Chapter 13, "Writing Portable Programs," for more information
about based pointers.
C.7.3 Pointer Subtraction
The type of integer required to hold the difference between two pointers to
elements of the same array, ptrdiff_t (§3.3.6, 4.1.1)
A ptrdiff_t is a signed integer in the range -32768 to 32767, with one
exception. Because huge pointers can address more than 64K of memory,
subtracting one huge pointer from another can yield a result that is a long
integer. The result of subtracting two huge pointers should be cast to a
long.
The compiler normalizes based pointers when necessary. In most cases, based
pointers are treated as far pointers.
C.8 Registers
C.8.1 Availability of Registers
The extent to which objects can actually be placed in registers by use of
the register storage-class specifier (§3.5.1)
Two registers, SI and DI, are available in Microsoft C. Register variables
with a type that has 16 bits may be allocated in these registers.
C.9 Structures, Unions, Enumerations, and Bit Fields
C.9.1 Improper Access to a Union
A member of a union object is accessed using a member of a different type
(§3.3.2.3)
If a union of two types is declared and one value is stored, but the union
is accessed with the other type, the results are unreliable.
For example, a union of float and int is declared. A float value is stored,
but the program later accesses the value as an int. In such a situation, the
value would depend on the internal storage of float values. The integer
value would not be reliable.
C.9.2 Sign of Bit Fields
Whether a "plain" int field is treated as a signed int bit field or as an
unsigned int bit field (§3.5.2.1)
Bit fields can be signed or unsigned. Plain bit fields are treated as
signed.
C.9.3 Storage of Bit Fields
The order of allocation of bit fields within an int (§3.5.2.1)
Bit fields are allocated within a 16-bit integer from least-significant to
mostsignificant bit. In the following code,
struct mybitfields
{
unsigned a : 4;
unsigned b : 5;
unsigned c : 7;
} test;
void main( void )
{
test.a = 2;
test.b = 31;
test.c = 0;
}
the bits would be arranged as follows:
00000001 11110010
cccccccb bbbbaaaa
Since the 80x86 processors store the low byte of integer values before the
high byte, the integer 0x01F2 above would be stored in physical memory as
0xF2 followed by 0x01.
C.9.4 Alignment of Bit Fields
Whether a bit field can straddle a storage-unit boundary (§3.5.2.1)
Bit fields default to size short, which can cross a byte boundary (see
Section C.9.3 above) but not a 16-bit boundary. If the size and location of
a bit field would cause it to overflow the current integer, the field is
moved to the beginning of the next available integer.
If a bit field is declared as a long, it can hold up to 32 bits.
In either case, an individual field cannot cross a 16- or 32-bit boundary.
C.9.5 The enum Type
The integer type chosen to represent the values of an enumeration type
(§3.5.2.2)
A variable declared as enum is a signed short integer.
C.10 Qualifiers
C.10.1 Access to Volatile Objects
What constitutes an access to an object that has volatile-qualified type
(§3.5.3)
Any reference to a volatile-qualified type is an access.
C.11 Declarators
C.11.1 Maximum Number
The maximum number of declarators that can modify an arithmetic, structure,
or union type (§3.5.4)
Microsoft C does not limit the number of declarators. The number is limited
only by available memory.
C.12 Statements
C.12.1 Limits on Switch Statements
The maximum number of case values in a switch statement (§3.6.4.2)
Microsoft C does not limit the number of case values in a switch statement.
The number is limited only by available memory.
C.13 Preprocessing Directives
C.13.1 Character Constants and Conditional Inclusion
Whether the value of a single-character character constant in a constant
expression that controls conditional inclusion matches the value of the same
character constant in the execution character set. Whether such a character
constant can have a negative value (§3.8.1)
The character set used in preprocessor statements is the same as the
execution character set. The preprocessor recognizes negative character
values.
C.13.2 Including Bracketed File Names
The method for locating includable source files (§3.8.2)
The preprocessor first searches the directories specified by the CL option
/I. If the /I option is not present or if it fails, the preprocessor uses
the INCLUDE environment variable to find any include files within angle
brackets. If more than one directory appears as part of the /I option or
within the INCLUDE variable, the preprocessor searches them in the order
they appear.
For example, the command
CL /ID:\MSC\INCLUDE MYPROG.C
causes the preprocessor to search the directory D:\MSC\INCLUDE for include
files such as STDIO.H.
The commands
SET INCLUDE = D:\MSC\INCLUDE
CL MYPROG.C
have a similar effect.
If both sets of searches fail, a fatal error is generated.
C.13.3 Including Quoted File Names
The support for quoted names for includable source files (§3.8.2)
If the file name is fully specified, with a path that includes a colon (for
example, F:\C6\SPECIAL\INCL\ORANGE.H), the preprocessor follows the path.
If the file name is not fully specified, the preprocessor searches the
directory of the file that included it. If the file is not found there, the
preprocessor searches the parent directory, the parent's parent, and so on,
terminating with the root directory.
If the include file is not found in any of those directories, the rules for
bracketed file names apply.
C.13.4 Character Sequences
The mapping of source file character sequences (§3.8.2)
Preprocessor statements use the same character set as source file statements
with the exception that escape sequences are not supported.
Thus, to specify a path for an include file, use only one backslash:
#include "path1\path2\myfile"
Within source code, two backslashes are necessary:
fil = fopen( "path1\\path2\\myfile", "rt" );
C.13.5 Pragmas
The behavior on each recognized #pragma directive (§3.8.6)
The following pragmas are defined in the Microsoft C Reference:
#pragma alloc_text #pragma optimize
#pragma check_pointer #pragma pack
#pragma check_stack #pragma page
#pragma comment #pragma pagesize
#pragma function #pragma same_seg
#pragma intrinsic #pragma skip
#pragma linesize #pragma subtitle
#pragma loop_opt #pragma title
#pragma message
C.13.6 Default Date and Time
The definitions for _DATE_ and _TIME_ when, respectively, the date and time
of translation are not available (§3.8.8)
When a hardware clock is not accessible, the default values for _DATE_ and
_TIME_ are Friday, May 3, 1957 and 5:00 PM.
C.14 Library Functions
C.14.1 NULL Macro
The null pointer constant to which the macro NULL expands (§4.1.5)
Several include files define the NULL macro as ((void *)0).
C.14.2 Diagnostic Printed by the assert Function
The diagnostic printed by and the termination behavior of the assert
function (§4.2)
The assert function prints a diagnostic message and calls the abort routine
if the expression is false (0). The diagnostic message has the form
Assertion failed: [expression], file [filename], line [linenumber]
where filename is the name of the source file and linenumber is the line
number of the assertion that failed in the source file. No action is taken
if expression is true (nonzero).
C.14.3 Character Testing
The sets of characters tested for by the isalnum, isalpha, iscntrl, islower,
isprint, and isupper functions (§4.3.1)
Function Tests For
────────────────────────────────────────────────────────────────────────────
isalnum Characters 0 - 9, A-Z, a-z
ASCII 48-57, 65-90, 97-122
isalpha Characters A-Z, a-z
ASCII 65-90, 97-122
iscntrl ASCII 0 -31, 127
islower Characters a-z
ASCII 97-122
isprint Characters A-Z, a-z, 0 - 9, punctuation,
space
ASCII 32-126
isupper Characters A-Z
ASCII 65-90
C.14.4 Domain Errors
The values returned by the mathematics functions on domain errors (§4.5.1)
The ERRNO.H file defines the domain error constant EDOM as 33.
C.14.5 Underflow of Floating-Point Values
Whether the mathematics functions set the integer expression errno to the
value of the macro ERANGE on underflow range errors (§4.5.1)
A floating-point underflow does not set the expression errno to ERANGE. When
a value approaches zero and eventually underflows, the value is set to zero.
C.14.6 The fmod Function
Whether a domain error occurs or zero is returned when the fmod function has
a second argument of zero (§4.5.6.4)
When the fmod function has a second argument of zero, the function returns
zero.
C.14.7 The signal Function
The set of signals for the signal function (§4.7.1.1)
The first argument passed to signal must be one of the symbolic constants
listed below. The constants are defined in SIGNAL.H. Also listed is the
operating mode support for each signal.
Signal Argument Description
────────────────────────────────────────────────────────────────────────────
SIGABRT Abnormal termination (real and protected
mode).
SIGBREAK CTRL+BREAK signal. Terminates the
calling program (protected mode only).
SIGFPE Floating-point error, such as overflow,
division by zero, or invalid operation.
Terminates the calling program (real and
protected mode).
SIGILL Illegal instruction. Terminates the
calling program (protected mode only).
SIGINT CTRL+C interrupt. Issues INT 23H (real
and
protected mode).
SIGSEGV Illegal storage access. Not generated by
DOS or OS/2, but supported for ANSI
compatibility. Terminates the calling
program (real and protected mode).
SIGTERM Termination request sent to the program.
Not generated by DOS or OS/2, but
supported for ANSI compatibility.
Terminates the calling program (real and
protected mode).
SIGUSR1 OS/2 process flag A (protected mode
only).
SIGUSR2 OS/2 process flag B (protected mode
only).
SIGUSR3 OS/2 process flag C (protected mode
only).
C.14.8 Default Signals
If the equivalent of signal (sig, SIG_DFL) is not executed prior to the call
of a signal handler, the blocking of the signal that is performed (§4.7.1.1)
Signals are set to their default status when a program begins running.
C.14.9 The SIGILL Signal
Whether the default handling is reset if the SIGILL signal is received by a
handler specified to the signal function (§4.7.1.1)
The SIGILL signal applies to OS/2 applications only. When SIGILL is
received, the signal handling is not reset to the default SIG_DFL.
C.14.10 Terminating Newline Characters
Whether the last line of a text stream requires a terminating newline
character (§4.9.2)
Stream functions recognize either newline or end-of-file as the terminating
character for a line.
C.14.11 Blank Lines
Whether space characters that are written out to a text stream immediately
before a newline character appear when read in (§4.9.2)
Space characters are preserved.
C.14.12 Null Characters
The number of null characters that can be appended to data written to a
binary stream (§4.9.2)
Any number of null characters can be appended to a binary stream.
C.14.13 File Position in Append Mode
Whether the file position indicator of an append mode stream is initially
positioned at the beginning or end of the file (§4.9.3)
When a file is opened in append mode, the file position indicator initially
points to the end of the file.
C.14.14 Truncation of Text Files
Whether a write on a text stream causes the associated file to be truncated
beyond that point (§4.9.3)
Writing to a text stream does not truncate the file beyond that point.
C.14.15 File Buffering
The characteristics of file buffering (§4.9.3)
Disk files accessed through standard I/O functions are fully buffered. By
default, the buffer holds 512 bytes. Some of the low-level DOS and BIOS
functions (all of which are non-ANSI) are unbuffered.
C.14.16 Zero-Length Files
Whether a zero-length file actually exists (§4.9.3)
Files with a length of zero are permitted.
C.14.17 File Names
The rules for composing valid file names (§4.9.3)
A file specification can include an optional drive letter (always followed
by a colon), a series of optional directory names (separated by
backslashes), and a file name.
File names and directory names can contain up to eight characters followed
by a period and a three-character extension. Case is ignored. The wild-card
characters * and ? are not permitted within the name or extension.
C.14.18 File Access Limits
Whether the same file can be open multiple times (§4.9.3)
Opening a file that is already open is not permitted.
C.14.19 Deleting Open Files
The effect of the remove function on an open file (§4.9.4.1)
The remove function deletes a file, even if the file is open.
C.14.20 Renaming with a Name that Exists
The effect if a file with the new name exists prior to a call to the rename
function (§4.9.4.2)
If you attempt to rename a file using a name that exists, the rename
function fails and returns an error code.
C.14.21 Printing Pointer Values
The output for %p conversion in the fprintf function (§4.9.6.1)
Microsoft C supports three types of pointer conversions: %p (a pointer), %lp
(a 32-bit far pointer), and %hp (a 16-bit near pointer).
The fprintf function produces hexadecimal values of the form XXXX (an
offset) for near pointers or XXXX:XXXX (a segment plus an offset, separated
by a colon) for far pointers. The output for %p depends on the memory model
in use.
C.14.22 Reading Pointer Values
The input for %p conversion in the fscanf function (§4.9.6.2)
When the %p format character is specified, the fscanf function converts
pointers from hexadecimal ASCII values into the correct address.
C.14.23 Reading Ranges
The interpretation of a dash (-) character that is neither the first nor the
last character in the scanlist for % [ conversion in the fscanf function
(§4.9.6.2)
The following line
fscanf( fileptr, "%[A-Z]", strptr);
reads any number of characters in the range A-Z into the string to which
strptr points.
C.14.24 File Position Errors
The value to which the macro errno is set by the fgetpos or ftell function
on failure (§4.9.9.1, 4.9.9.4)
When fgetpos or ftell fails, errno is set to the manifest constant EINVAL if
the position is invalid or EBADF if the file number is bad. The constants
are defined in ERRNO.H.
C.14.25 Messages Generated by the perror Function
The messages generated by the perror function (§4.9.10.4)
The perror function generates these messages:
0 Error 0
1
2 No such file or directory
3
4
5
6
7 Arg list too long
8 Exec format error
9 Bad file number
10
11
12 Not enough core
13 Permission denied
14
15
16
17 File exists
18 Cross-device link
19
20
21
22 Invalid argument
23
24 Too many open files
25
26
27
28 No space left on device
29
30
31
32
33 Math argument
34 Result too large
35
36 Resource deadlock would occur
C.14.26 Allocating Zero Memory
The behavior of the calloc, malloc, or realloc function if the size
requested is zero (§4.10.3)
The calloc, malloc, and realloc functions accept zero as an argument. No
actual memory is allocated, but the memory size can be modified later by
realloc.
C.14.27 The abort Function
The behavior of the abort function with regard to open and temporary files
(§4.10.4.1)
The abort function does not close files that are open or temporary. It does
not flush stream buffers.
C.14.28 The atexit Function
The status returned by the atexit function if the value of the argument is
other than zero, EXIT_SUCCESS, or EXIT_FAILURE (§4.10.4.3r)
The atexit function returns zero if successful, or a nonzero value if
unsuccessful.
C.14.29 Environment Names
The set of environment names and the method for altering the environment
list used by the getenv function (§4.10.4.4)
The set of environment names is unlimited.
To change environment variables from within a C program, call the putenv
function. To change environment variables from the DOS command line, use the
SET command (for example, SET LIB = D:\ LIBS).
Environment variables exist only as long as their host copy of DOS is
running. For example, the line
system( "SET LIB = D:\LIBS" );
would run a copy of DOS, set the environment variable LIB, and return to the
C program, exiting the secondary copy of DOS. Exiting that copy of DOS
removes the temporary environment variable LIB.
Likewise, changes made by the putenv function last only until the program
ends.
C.14.30 The system Function
The contents and mode of execution of the string by the system function
(§4.10.4.5)
The system function executes an internal DOS or OS/2 command, or an EXE,
COM, or BAT file from within a C program rather than from the command line.
It examines the COMSPEC environment variable to find the command
interpreter, which is typically COMMAND.COM in DOS or CMD.EXE in OS/2. The
system function then passes the argument string to the command interpreter.
C.14.31 The strerror Function
The contents of the error message strings returned by the strerror function
(§4.11.6.2)
The strerror function generates these messages:
0 Error 0
1
2 No such file or directory
3
4
5
6
7 Arg list too long
8 Exec format error
9 Bad file number
10
11
12 Not enough core
13 Permission denied
14
15
16
17 File exists
18 Cross-device link
19
20
21
22 Invalid argument
23
24 Too many open files
25
26
27
28 No space left on device
29
30
31
32
33 Math argument
34 Result too large
35
36 Resource deadlock would occur
C.14.32 The Time Zone
The local time zone and Daylight Saving Time (§4.12.1)
The local time zone is Pacific Standard Time. Microsoft C supports Daylight
Saving Time.
C.14.33 The clock Function
The era for the clock function (§4.12.2.1)
The clock function's era begins (with a value of 0) when the C program
starts to execute. It returns times measured in 1/1000th seconds.
INDEX
──────────────────────────────────────────────────────────────────────────
80x87 coprocessor
80x87
Detection of
A
Alternate math package
B
_based keyword
C
C extensions, PWB
building protected-mode
building real-mode
calling C library functions
calling C library routines
calling PWB functions
describing functions and switches
initializing functions
prototyping functions
receiving parameters
sample
versus executable files
C extentions, PWB
building protected-mode
Calls
Emulator math package
Emulator package
Floating-point
Math coprocessor package
CODEVIEW.LST file
CodeView
debugging
DLLs with
CONFIG.SYS file
CONFIG.SYS files
_control87
Coordinates
overview
physical
screen location
viewport
window
Coprocessor
CURRENT.STS
D
Default math package
Denormalized numbers
Dot commands
double
E
80x87 coprocessor
80x87
Detection of
EMOEM.ASM
Emulator math package
Emulator package
In-line
With dynamic-link libraries
Environment
NO87 variable
F
_far keyword
_fastcall keyword
Files
.FON
CODEVIEW.LST
CONFIG.SYS
CURRENT.STS
TOOLS.INI See TOOLS.INI
Fill patterns
float
Floating-point accumulator
Floating-point math
requirements, DLLs
Floating-point
Alternate math package
Biased exponent
Compatibility between options
Floating-Point
Controlling
Floating-point
Default math package
Default package
Denormalized numbers
Effect of calls on code size
Effect of calls on speed
Effect on optimization
Emulator package
Exception masking
Exceptions
exponent
Fastest programs
In libraries
Infinities
Interrupt-enable
Library considerations for
Mantissa
Math coprocessor package
Maximizing accuracy
NaNs
On non-IBM compatible computers
Packages
Precision
Program size
Program speed
Sign bit
Smallest programs
Transcendental function results
Underflow
Using in dynamic-link libraries
With libraries not provided by Microsoft
.FON files
/FPa
/FPc
/FPc87
/FPi
/FPi87
Function calls
near call
C calling convention
FORTRAN/Pascal calling convention
register calling convention
_fastcall calling convention
Functions
drive and directory (list)
graphics (list)
initializing
prototyping
Returning floating-point types
WhenLoaded
G
/Gd option
Gd option
Graphics
video modes
default mode
graphics mode, defined
text mode, defined
H
Help files
local help context
HIMEM.SYS driver
_huge keyword
_huge Keyword
_huge keyword
I
IEEE
In-line assembly
advantages
In-line
Floating-point emulator package
Floating-point emulator
Floating-point instructions
Floating-point math coprocessor package
Floating-point
Institute of Electrical and Electronics Engineers
see IEEE; see
L
Language conventions
calling conventions
naming conventions
parameter-passing conventions
Libraries
dynamic-link See DLLs
import
special
LINK
/EXEPACK option
/FARCALLTRANSLATION option
/NODEFAULTLIBRARYSEARCH (/NOD) option
/NOEXTENDEDDICTSEARCH (/NOE) option
/NOIGNORECASE (/NOI) option
/PACKCODE option
/PACKDATA option
/PADCODE option
/PADDATA option
/TINY option
compatibility (/Lc)
PACKCODE option
protected-mode (/Lp)
real-mode (/Lr)
LLIBCDLL.LIB
long double
M
Macros
inherited
Math coprocessor package
In-line
N
_near keyword
NO87
O
optimise pragma
Optimization
Effect of floating-point math on
optimize pragma
OS/2
calling
P
Process
child
debugging multiple processes
Programmer's WorkBench
see PWB; see
Pseudotargets
PWB
80x87 option
Debug Build Options
Emulation calls option
extensions.See C extensions, PWB
Fast alternate math option
Global Compile Options
Inline 80x87 Instructions option
Inline Emulation option
Release Build Options
Selecting floating-point options from
R
Run-time
support of type long double
S
SETUP
SLLs
data segments
T
Threads
_beginthread function
_endthread function
/TINY option
Type
double
float
long double
Promotion of floating point
Range of floating-point
Significance of floating-point
Storage requirements for floating-point
Widening for floating-point types
Types
double
float
long double
V
Variables
Declaring as floating-point types
Precision with floating-point
Promotion of floating-point
Range of floating-point
Significance of
Storage requirements for