Reflection in C++ - Past, Present, and Hopeful Future - Andrei Alexandrescu
Formal definition: reflection is the ability of a program to observe its own code and shape its behavior accordingly.
- In C++, we are going to focus on compile-time reflection, not runtime, and the focus is on generic code.
- A good definition for C++ reflection would be:
the ability of a template to query details of its template parameters and shape its definition accordingly.
- This allows the template to morph itself by adding or removing data members, methods, type definitions, and more, depending on its template parameters.
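A rough approximation of this idea is already possible with type traits. The sketch below is an illustration only: `Holder`, its 16-byte threshold, and the member names are assumptions, not something from the talk. It shows a template that queries its parameter and picks a different data member as a result.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical example: Holder<T> picks its own data member by inspecting T.
// Small, trivially copyable types are stored inline; everything else goes
// on the heap. The 16-byte threshold is an arbitrary assumption.
template <typename T>
class Holder {
    static constexpr bool inline_ok =
        sizeof(T) <= 16 && std::is_trivially_copyable_v<T>;

public:
    // The type of the data member depends on the template parameter.
    using storage_type = std::conditional_t<inline_ok, T, std::unique_ptr<T>>;

    explicit Holder(T value) : storage_(make(std::move(value))) {}

    const T& get() const {
        if constexpr (inline_ok) {
            return storage_;
        } else {
            return *storage_;
        }
    }

private:
    static storage_type make(T value) {
        if constexpr (inline_ok) {
            return value;
        } else {
            return std::make_unique<T>(std::move(value));
        }
    }

    storage_type storage_;
};
```

With these assumptions, `Holder<int>` stores an `int` directly, while `Holder<std::string>` stores a `std::unique_ptr<std::string>`; callers use `get()` in both cases.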
Designing Code by Reflection in C++
Two Main Components:
- Reflection Proper: Querying attributes of a type (e.g., methods, copyability, virtual destructors).
- What was the author of the type thinking?
- Insertion: Using information from reflection to change or generate new code.
- Also known as code generation based on reflection.
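The querying half can be sketched with what the standard library offers today. The standard traits below are real; `has_size`, `Shape`, and `Token` are hypothetical names introduced only for this illustration.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Detection idiom: has_size<T> is true when `t.size()` is a valid expression.
// The name has_size is hypothetical, not a standard trait.
template <typename T, typename = void>
struct has_size : std::false_type {};

template <typename T>
struct has_size<T, std::void_t<decltype(std::declval<const T&>().size())>>
    : std::true_type {};

struct Shape {  // polymorphic base with a virtual destructor
    virtual ~Shape() = default;
};

struct Token {  // move-only value type
    Token() = default;
    Token(const Token&) = delete;
    Token(Token&&) = default;
};

// Each query recovers a decision the author of the type made.
static_assert(std::has_virtual_destructor_v<Shape>);
static_assert(!std::has_virtual_destructor_v<std::string>);
static_assert(std::is_copy_constructible_v<std::string>);
static_assert(!std::is_copy_constructible_v<Token>);
static_assert(has_size<std::vector<int>>::value);
static_assert(!has_size<int>::value);
```

Each trait answers one yes/no question; enumerating all members of a type is beyond what these tools can do, which is the gap reflection is meant to fill.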
Three Stages of Design by Reflection:
- Extraction (Reflection Proper):
- Query and extract information about types.
- Represented in the current C++ proposal.
- Processing (Synthesis):
- Use extracted information to create new information.
- Evaluated during compilation.
- More power needed for full capabilities.
- Generation (Insertion):
- Use processed information to generate new code.
- Passed back to the compiler for further compilation.
- Not well-represented in the current C++ proposal, needs improvement.
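The three stages can be loosely mapped onto today's C++ as a sketch. `Slot`, `Vec3`, and `round_up_pow2` are hypothetical names chosen for this illustration, and the mapping is an analogy rather than the proposal's model.

```cpp
#include <cstddef>

// Stage 1, extraction: sizeof(T) and alignof(T) are queried from the type.

// Stage 2, processing: ordinary constexpr code turns the extracted facts
// into new information (here, the next power of two).
constexpr std::size_t round_up_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p *= 2;
    return p;
}

// Stage 3, generation: the processed value shapes a new definition.
// Slot is a hypothetical raw buffer padded to a power-of-two size.
template <typename T>
struct Slot {
    alignas(T) unsigned char bytes[round_up_pow2(sizeof(T))];
};

struct Vec3 {
    float x, y, z;
};

static_assert(sizeof(Slot<Vec3>) == round_up_pow2(sizeof(Vec3)));
static_assert(alignof(Slot<Vec3>) == alignof(Vec3));
```

Generation here is limited to choosing an array bound; inserting arbitrary new members or functions is the part the notes describe as not yet well covered.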
Andrei's Theory of Programming Language Design
Human Endeavor in Programming:
- Reasoning About Self and Others:
- A fundamentally human skill, e.g., empathy, understanding context.
- Suggested to be incorporated into programming languages.
Contrast with Technology-Driven Features:
- Configurable Syntax:
- Possible in technology but not a human endeavor.
- Receives periodic interest but not aligned with how the human mind works.
- Suggested to be avoided in programming language design.
Key Takeaways:
- Programming languages should focus on features that align with human cognitive abilities.
- Avoid features like configurable syntax that, while technologically possible, are not naturally intuitive for humans.
Crisis in Software Development: The Million Lines of Code Challenge
The Problem:
- Growing Codebases: Every project is heading towards 1 million lines of code or more.
- Bug Rate: One bug per 1,000 lines in large projects, even with high budgets.
- Language-Independent: Bug rate is proportional to lines of code, regardless of the programming language.
- Super-Linear Growth: Complexity grows faster than the size of the code.
Cultural Attitude:
- Code as Achievement: Many see more lines of code as an accomplishment, not a liability.
- Lack of Attention: Not enough focus on the sheer size of codebases.
Current Language Mechanisms:
- Type Systems, Concepts, Checkers, Linters: Constrain code but also force more lines to be written.
- Eliminate Good and Bad Programs: These mechanisms reject some good programs along with the bad ones; an ideal feature would eliminate only bad programs.
Python vs C++:
- Python's Simplicity: Python users can achieve more with fewer lines, questioning the need for complex type systems.
Proposed Solution:
- Increase Leverage: Look for language features that allow more to be done with fewer lines of code.
- Generative Aspect: Good language features should enable more correct programs from fewer lines of code.
Key Takeaways:
- The size of codebases is a growing problem that needs more attention.
- Language features should aim to reduce the number of lines needed for correct programs.
- There's a need to shift the cultural attitude that equates more lines of code with success.
The Boilerplate Problem in Software Development
Common Misconception:
- Not Just Boilerplate: While boilerplate code is a problem, it's not the only issue. There's also creative code that adds to the complexity.
Types of Boilerplate:
- Obvious Boilerplate: Contiguous blocks of repetitive code that are easy to spot.
- Interspersed Boilerplate: Mixed with functionality, making it hard to remove. Includes repeated declarations and signatures.
Language Limitations:
- Namespace Detail: The widespread `namespace detail` idiom for hiding helper code indicates a failure of the language to provide more elegant solutions.
- Template Complexity: C++ templates require a lot of boilerplate, especially for recursion and specialization.
Real-World Example:
- Boost Library: Despite being a great library, it contains a lot of boilerplate.
- Unordered Map: 7,000 lines of code, making it hard to understand how collisions are solved or where the actual work is done.
Beyond Boilerplate:
- Reflection: While boilerplate is a problem in C++, reflection offers more than just reducing boilerplate.
Key Takeaways:
- Boilerplate is a symptom of deeper issues, including language limitations and the complexity of creative code.
- Different types of boilerplate exist, some more insidious than others.
- There's a need for language features that reduce boilerplate and increase code efficiency.
The Challenge of Creating a "Tainted String" Type in C++
The Concept:
- Tainted String: A string type that is not trusted and needs validation.
- Strong Typing: Known as "strong typedef" or "strong using" in C++.
Implementation Challenges:
- Type Aliasing: Using `typedef` or `using` only creates an alternate name, not a new type.
- Public Inheritance: Not recommended for value types.
- Private Inheritance: Recommended but comes with its own set of problems.
Code Example:
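```cpp
#include <string>

// Private inheritance: reuse std::string's implementation, but re-export
// each member by hand.
struct tainted_string : private std::string {
    using std::string::empty;
    using std::string::size;
    //...
};
```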
Issues with Private Inheritance:
- Using Declarations: Need to explicitly specify which members to inherit.
- Constructors: Don't take the right types.
- Iterators: Need to create a new iterator type for `tainted_string`.
- Methods: Methods that take or return `std::string` need to be rewritten for `tainted_string`.
Result:
- Code Bloat: 700 lines of code for a type that does nothing algorithmically.
- Lack of Abstraction: The language can't encode the notion of copying a type, so the copy has to be spelled out by hand, which is the opposite of abstraction.
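A slightly fuller sketch makes the contrast between an alias and a new type concrete. The `validate` member and its check are placeholders invented for this illustration, not part of the talk, and a complete type would need far more re-exported members.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// An alias is only another name for the same type.
using alias_string = std::string;
static_assert(std::is_same_v<alias_string, std::string>);

// Private inheritance yields a distinct type, but every member that should
// stay visible has to be re-exported by hand.
class tainted_string : private std::string {
public:
    explicit tainted_string(std::string raw) : std::string(std::move(raw)) {}

    using std::string::empty;
    using std::string::size;
    // ... a complete version would continue for many more lines ...

    // Hypothetical validation hook: the only way to get a plain std::string.
    // The check (reject ';') is a placeholder, not a real sanitizer.
    bool validate(std::string& out) const {
        if (find(';') != npos) return false;
        out = static_cast<const std::string&>(*this);
        return true;
    }
};

static_assert(!std::is_same_v<tainted_string, std::string>);
```

Because the base is private, a `tainted_string` cannot be passed where a `std::string` is expected, which is the point of the exercise; the cost is the hand-written forwarding.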
Key Takeaways:
- Creating a new type based on an existing one is not straightforward in C++.
- The process involves a lot of boilerplate and manual work.
- This example highlights the need for better abstraction mechanisms in programming languages.
The Challenge of Customizing Existing Types in C++
The Concept:
- Custom Types: Often, developers want a type that is almost identical to an existing one but with slight modifications.
- Examples:
- A hash table that throws an exception on bracket operator use.
- A `std::vector` that grows automatically without throwing out-of-bounds exceptions.
- A URL or file type that is mostly like `std::string` but with encoding/decoding features.
The Irony of Inheritance:
- Original Promise: Inheritance was supposed to allow easy customization of existing types.
- Reality: Inheritance from value types is discouraged in C++ literature.
- Workaround: Developers still use inheritance from value types, despite the risks.
Key Issues:
- Customization: The ability to tweak an existing type for specific needs is crucial for true code reuse.
- Inheritance Limitations: Inheritance often doesn't deliver on its promise of easy customization, especially for value types.
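The workaround and its weakness can be sketched as follows. `growing_vector` and `base_access_throws` are hypothetical names for this illustration; the example assumes the "vector that grows automatically" case from the list above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical "almost std::vector": operator[] grows instead of going
// out of bounds. Public inheritance makes this short, but nothing in
// std::vector is virtual, so the customization is easy to bypass.
template <typename T>
class growing_vector : public std::vector<T> {
public:
    using std::vector<T>::vector;  // inherit the constructors

    // Hides (does not override) std::vector<T>::operator[].
    T& operator[](std::size_t i) {
        if (i >= this->size()) this->resize(i + 1);
        return std::vector<T>::operator[](i);
    }
};

// Code written against the base type never sees the customization:
// at() still throws instead of growing.
inline bool base_access_throws(std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Writing `gv[4] = 7` on an empty `growing_vector<int>` grows it to five elements, yet passing the same object to `base_access_throws` with index 10 still reports an exception, because the function only sees the base interface.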
The Limitations of Container Adapters in C++
The Concept:
- Container Adapters: These are wrappers around existing container types to provide specific functionalities.
- Examples: `std::stack`, `std::queue`, etc.
The Problem:
- Rigidity: Container adapters are rigid and code to the least common denominator of the container they adapt.
- Limited Success: They are not widely used because of their limitations.
Desired Features:
- Adaptive Behavior: Ideally, a container adapter should adapt its functionality based on the capabilities of the underlying container.
- Optional Enhancements: Should offer enhanced functionality when the underlying container supports it.
Example:
- `std::stack`: Developers often avoid using `std::stack` because it lacks features like random access, which might be available in the underlying container, such as `std::vector`.
Key Takeaways:
- Container adapters in their current form are too rigid and don't adapt to the capabilities of the underlying container.
- There's a need for more flexible container adapters that can offer enhanced functionalities based on the capabilities of the underlying container.
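One way such an adapter might look with current tools is sketched here. `adaptive_stack` and `has_subscript` are hypothetical names; the sketch uses the detection idiom and `std::enable_if_t` rather than the proposed reflection facilities.

```cpp
#include <cstddef>
#include <list>
#include <type_traits>
#include <utility>
#include <vector>

// Detects whether container C supports c[i]. Hypothetical helper name.
template <typename C, typename = void>
struct has_subscript : std::false_type {};

template <typename C>
struct has_subscript<
    C, std::void_t<decltype(std::declval<C&>()[std::size_t{0}])>>
    : std::true_type {};

// Hypothetical adaptive stack: always offers push/pop/top, and additionally
// offers operator[] when the underlying container has one.
template <typename T, typename Container = std::vector<T>>
class adaptive_stack {
public:
    void push(T value) { c_.push_back(std::move(value)); }
    void pop() { c_.pop_back(); }
    T& top() { return c_.back(); }
    std::size_t size() const { return c_.size(); }

    // Participates in overload resolution only for indexable containers.
    template <typename C = Container,
              std::enable_if_t<has_subscript<C>::value, int> = 0>
    T& operator[](std::size_t i) {
        return c_[i];
    }

private:
    Container c_;
};

// The adapter's interface follows the capabilities of its container.
static_assert(has_subscript<adaptive_stack<int>>::value);
static_assert(!has_subscript<adaptive_stack<int, std::list<int>>>::value);
```

Each optional capability needs its own detector and its own constrained member, so this approach scales poorly; that per-feature cost is the boilerplate the notes attribute to the template-based style.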
Pitfalls of Container-Independent Code (Scott Meyers' View)
- Effective STL Item 2: Scott Meyers explains the pitfalls of attempting to write container-independent code.
- Limited Recourse: You have very little flexibility to adapt the code to specific needs or optimizations.
Key Takeaways:
- The idea of container-independent code is largely an illusion due to the rigid limitations it imposes.
- This approach often leads to more boilerplate and less efficient code.
- Static reflection is on the verge of becoming mainstream in C++ due to advancements in compile-time evaluation and data structure manipulation.
- The language wasn't ready for it before because it lacked good constant compile-time evaluation and the ability to create complex data structures during compilation.
- Recent additions like `constexpr` and the ability to use `new` during compilation are making C++ more suitable for compile-time programming and reflection.
- Earlier approaches to static reflection in C++ were template-based, which were hard to work with and led to memory consumption issues during compilation.
- There has been a proposal for inheritance-based reflection, but it proved to be inefficient and difficult to use in practice.
- The optimal approach for C++ is value-based reflection, allowing developers to use familiar C++ constructs like loops and tests during compile-time against type information.
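The difference between the template-based and value-based styles can be felt even without reflection. The sketch below is an analogy under that assumption: it counts even numbers at compile time in both styles, with `count_even` and `count_even_values` as hypothetical names.

```cpp
#include <array>
#include <cstddef>

// Template-based style: counting even values needs recursion and a
// specialization to stop it. Every step instantiates a new type.
template <int... Values>
struct count_even;

template <>
struct count_even<> {
    static constexpr std::size_t value = 0;
};

template <int Head, int... Tail>
struct count_even<Head, Tail...> {
    static constexpr std::size_t value =
        (Head % 2 == 0 ? 1 : 0) + count_even<Tail...>::value;
};

// Value-based style: the same computation as an ordinary loop over values,
// evaluated by the compiler.
template <std::size_t N>
constexpr std::size_t count_even_values(const std::array<int, N>& values) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (values[i] % 2 == 0) ++n;
    }
    return n;
}

static_assert(count_even<1, 2, 3, 4, 6>::value == 3);
static_assert(count_even_values(std::array<int, 5>{1, 2, 3, 4, 6}) == 3);
```

The second version reads like everyday code and creates no chain of intermediate types, which is the property value-based reflection aims to bring to queries about types.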
Proposed C++ primitives
```cpp
#include <meta>
#include <string>
#include <type_traits>
// Proposed language change:
// ^T: "reify T", where T is an identifier, expression, etc.;
//     yields a constexpr value of type meta::info.
// [:e:]: converts a value back to an alias, where e is a compile-time
//     value of type meta::info; this yields a template name or type.
// [:^T:] is T, and ^[:e:] is e; they are like & and * for lvalues.
template <typename T>
requires(std::is_enum_v<T>) constexpr std::string to_string(T value) {
    template for (constexpr auto e : std::meta::members_of(^T)) {
        if ([:e:] == value) {
            return std::string(std::meta::name_of(e));
        }
    }
    return "<unnamed>";
}
```
- The proposed C++ code example demonstrates the use of static reflection to convert enumerated types to strings.
- The code introduces two new language constructs: `^T` to "reify" a type into a value, and `[:e:]` to convert a value back to a type or alias.
- The `template for` loop iterates over all the members of the enumerated type, using the `std::meta::members_of` function.
- The `if` statement compares the runtime value of the enumerated type with each member, converting each reflected member back into an enumerator with the `[:e:]` syntax.
- The function returns the name of the enumerated value if it matches, or "<unnamed>" otherwise.
- The code aims to reduce boilerplate by automating the generation of code for converting enumerated types to strings.
- The example highlights the power of static reflection and compile-time programming in C++, allowing for both compile-time and runtime functionality.
- The proposal aims for minimal changes to the language, focusing on adding powerful features for metaprogramming.
- The use of iteration is emphasized as crucial for reducing the amount of code and for generating boilerplate automatically.
- The example also suggests that the new syntax could be used in a `switch` statement, further enhancing its utility.
```cpp
template <typename T>
requires(std::is_enum_v<T>) constexpr std::string to_string(T value) {
    switch (value) {
        template for (constexpr auto e : std::meta::members_of(^T)) {
            case [:e:]: return std::string(std::meta::name_of(e));
        }
        default: return "<unnamed>";
    }
}
```
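For comparison, this is roughly what has to be written by hand without reflection. `Color` is a hypothetical enum used only for this illustration.

```cpp
#include <string>

// Hypothetical enum used only for illustration.
enum class Color { red, green, blue };

// Without reflection, every enumerator is spelled twice, and the function
// must be edited by hand whenever the enum changes.
inline std::string to_string(Color value) {
    switch (value) {
        case Color::red:   return "red";
        case Color::green: return "green";
        case Color::blue:  return "blue";
    }
    return "<unnamed>";
}
```

The function has to be repeated for every enum in a codebase, which is the interspersed kind of boilerplate the reflective version is meant to generate automatically.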
- The updated C++ code example introduces algorithm selection based on the size of the enumerated type.
```cpp
template <typename T>
requires(std::is_enum_v<T>) constexpr std::string to_string(T value) {
    if constexpr (std::meta::members_of(^T).size() <= 7) {
        template for (constexpr auto e : std::meta::members_of(^T)) {
            if ([:e:] == value) {
                return std::string(std::meta::name_of(e));
            }
        }
    } else {
        switch (value) {
            template for (constexpr auto e : std::meta::members_of(^T)) {
                case [:e:]: return std::string(std::meta::name_of(e));
            }
        }
    }
    return "<unnamed>";
}
```
- The function `to_string` now uses an `if constexpr` statement to decide between two different algorithms for converting an enumerated value to a string.
- If the enumerated type has seven or fewer members, the function uses a `template for` loop to iterate over each member and compare it to the input value.
- If it has more than seven members, the function uses a `switch` statement to improve performance. The `case` labels are generated using the `[:e:]` syntax to convert compile-time values back to enumerators.
- This approach allows the function to adapt its behavior based on the characteristics of the input type, optimizing for either code size or performance as needed.
- The example demonstrates the power of static reflection and compile-time programming to make decisions about code generation, a capability usually reserved for compiler writers.
- This level of control allows for more efficient and adaptable code, but also comes with the responsibility to use these powerful features wisely.
- The code example uses Andrei's "Rule of Seven" to decide between two different algorithms for converting an enumerated type to a string. If the enumerated type has seven or fewer members, it uses a cascade of `if` statements. Otherwise, it uses a `switch` statement.
- The most powerful part of the example is the common path, represented by the line `return "<unnamed>";`. This line is shared between the two different algorithms, demonstrating the ability to reuse code even when different paths are taken.
- The limitation of `if constexpr` is that it introduces a new scope, preventing the definition of a variable that can be used in the common path. The example is carefully chosen to avoid illustrating this limitation.
- The use of `if constexpr` is "painfully close" to being useful for more general code injection into the existing scope, but falls short because it introduces a new scope.
- The example illustrates the high density of code and the reuse of common paths, which are key advantages of this approach.
- While compilers already perform similar optimizations, the general case enabled by static reflection and compile-time programming allows for more sophisticated algorithm selection that compilers cannot do.
- The ability to choose algorithms and generate code conditionally gives programmers immense power, similar to what compiler writers have, but it should be used responsibly.
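The same pattern of compile-time algorithm selection with a shared common path can be tried in C++17 on something simpler than an enum. In this sketch, `index_of` is a hypothetical function; the threshold of 7 follows the notes, while the choice of linear scan versus binary search is this example's own assumption.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical lookup in a sorted std::array. The threshold of 7 mirrors the
// "Rule of Seven" from the notes; the two strategies are this sketch's own.
template <std::size_t N>
int index_of(const std::array<int, N>& sorted, int key) {
    if constexpr (N <= 7) {
        // Small input: a linear scan, like the cascade of ifs.
        for (std::size_t i = 0; i < N; ++i) {
            if (sorted[i] == key) return static_cast<int>(i);
        }
    } else {
        // Larger input: binary search.
        auto it = std::lower_bound(sorted.begin(), sorted.end(), key);
        if (it != sorted.end() && *it == key) {
            return static_cast<int>(it - sorted.begin());
        }
    }
    // Common path shared by both strategies. A variable declared inside
    // either branch above would be out of scope here.
    return -1;
}
```

Only the selected branch is instantiated for a given `N`, and the final `return -1;` is written once for both; the closing comment marks where the scope limitation of `if constexpr` would start to hurt.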