Primitive types
- Integer types are sized, except for `isize` and `usize`, whose width is architecture-specific and corresponds to C++ `std::ptrdiff_t`/`std::intptr_t` and `std::size_t`/`std::uintptr_t` respectively.
- `char` != `u8` != `i8`, oh thank god.
- String literals `"foo"` are UTF-8. They are slices (see below).
- Tuples are built into the language, use parentheses syntax, and roughly correspond to `std::tuple` and `std::pair`. Use `.i` syntax to access the i-th element. The unit tuple is `()`.
- Fixed-size arrays are built into the language, use bracket syntax, and roughly correspond to `std::array` with mandatory bounds checking.
- Simple types implement the `Copy` trait (e.g. scalars; they correspond to value types in Java), and their assignment copies. Tuples are `Copy` if all their elements are, and this may include arbitrarily large tuples. Otherwise, assignment moves, and a copy has to be made by an explicit `.clone()` call.
- String slices have type `&str` and can be created with `&var[x..y]` syntax; `x` and/or `y` may be omitted if they are zero and the length respectively. A slice consists of a pointer and a length.
- General slices are very similar. They have type `&[element_type]`.
- `!` is an empty type with no values, used as the return type of functions that never return. It can be coerced into any other type, which is used e.g. for `match` arms that `continue` or `panic!`.
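A minimal sketch of the points above: tuples, arrays, slices, and copy versus move. The helper names `prefix` and `sum` are made up for illustration.

```rust
/// Returns the first `n` bytes of `s` as a string slice (hypothetical helper).
/// Panics if `n` does not fall on a char boundary.
fn prefix(s: &str, n: usize) -> &str {
    &s[..n] // the start index is omitted because it is zero
}

/// Sums a general slice; works for arrays and vectors alike.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let pair: (i32, &str) = (7, "seven"); // tuple, roughly std::pair
    assert_eq!(pair.0, 7); // `.i` syntax

    let arr: [i32; 4] = [1, 2, 3, 4]; // fixed-size array, roughly std::array
    assert_eq!(sum(&arr[1..3]), 5); // slice of elements 1 and 2

    let a = 5_u8;
    let b = a; // `u8` is `Copy`, so `a` stays usable
    assert_eq!(a, b);

    let s = String::from("hello");
    let t = s.clone(); // explicit copy
    let u = s; // move: `s` is no longer usable after this line
    assert_eq!(t, u);
    assert_eq!(prefix(&u, 2), "he");
}
```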
Variables & type inference
- Variables (`let`) are immutable by default, but their names can be shadowed, allowing a sequence of changes in what a given variable name means. The type can change too. Mutable variables are declared with `let mut`.
- Even immutable variables may be declared without initialization, as long as they are assigned before the first use.
- Variable types are inferred, but can be annotated when needed.
- Constants are declared with `const` and require type annotations.
- Function parameter and return types must always be provided.
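A small sketch of shadowing, deferred initialization, and constants. `MAX_RETRIES`, `parse_and_double`, and `describe` are illustrative names, not anything from a real API.

```rust
const MAX_RETRIES: u32 = 3; // constants require a type annotation

/// Parses a number and doubles it, shadowing `value` along the way.
fn parse_and_double(input: &str) -> i64 {
    let value = input.trim(); // &str
    let value: i64 = value.parse().unwrap_or(0); // shadowed, the type changed
    value * 2
}

/// Uses an immutable variable that is declared first and assigned later.
fn describe(n: u32) -> &'static str {
    let label; // no initializer, type inferred from the assignments
    if n > MAX_RETRIES {
        label = "too many";
    } else {
        label = "ok";
    }
    label
}

fn main() {
    let mut counter = 0; // mutable, type inferred as i32
    counter += 1;
    assert_eq!(counter, 1);
    assert_eq!(parse_and_double(" 21 "), 42);
    assert_eq!(describe(5), "too many");
}
```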
Lifetimes, references & borrow checker
- All variables are owned by their enclosing scope. Ownership may be passed around; when the last owner's scope exits, the variable is destroyed (`drop` is executed if the `Drop` trait is implemented, and the memory is released).
- Passing a variable into a function moves it (or copies it), transferring its ownership into the function. Likewise, returning it moves it (or copies it), transferring ownership of the return value to the caller.
- Creating references is called borrowing. Immutable references are borrowed with `&`, do not allow modifying the pointed-to variable, and can be passed around without transferring ownership. Several of them may be active at the same time. Mutable references are borrowed with `&mut`, and no other mutable or immutable references may exist at the same time.
- Each reference has a lifetime. Most cases are handled by implicit lifetimes. Lifetimes are named like `'a`, usually with very short names, placed after `&`.
- For function signatures, generic lifetime parameters use angle brackets:

    fn foo<'a>(x: &'a str, y: &'a str) -> &'a str

- Lifetime annotations in struct definitions limit the struct lifetime to that of its fields.
- If there are multiple input lifetime parameters, but one of them is `&self` or `&mut self`, its lifetime is assigned to all output lifetime parameters.
- Deref coercion converts a reference to a `Deref`-implementing type to a reference to a different type, e.g. `&String` to `&str`. It happens automatically on a parameter-argument type mismatch: from `&T` to `&U` when `T: Deref<Target=U>`. Internally, as many `.deref()` calls are inserted as needed.
- For mutable references, implement the `DerefMut` trait. There are two extra deref coercion rules: from `&mut T` to `&mut U` when `T: DerefMut<Target=U>`, and from `&mut T` to `&U` when `T: Deref<Target=U>`.
- The `Drop` trait is the closest thing to a C++ destructor. It adds a `drop` method that takes a mutable reference to `self`.
- For structs, field lifetimes are part of the struct type, and should be specified where the struct type is specified.
- The `'static` lifetime is the lifetime of the whole program.
- Lifetimes are a kind of generic parameter, so a function with both lifetimes and generic type parameters lists both together in angle brackets.
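The borrowing and lifetime rules above can be sketched as follows. `longest`, `Excerpt`, and `append_bang` are illustrative names chosen for this example.

```rust
/// Both inputs and the output share the lifetime `'a`.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

/// A struct holding a reference cannot outlive the data it borrows.
struct Excerpt<'a> {
    text: &'a str,
}

impl<'a> Excerpt<'a> {
    /// Lifetime elision: the output lifetime is taken from `&self`.
    fn first_word(&self) -> &str {
        self.text.split_whitespace().next().unwrap_or("")
    }
}

/// Takes a mutable borrow; the caller keeps ownership.
fn append_bang(s: &mut String) {
    s.push('!');
}

fn main() {
    let mut owned = String::from("hello world");
    append_bang(&mut owned); // the mutable borrow ends with the call

    let r1 = &owned; // several immutable borrows may coexist
    let r2 = &owned;
    assert_eq!(longest(r1, "hi"), r2.as_str());

    let excerpt = Excerpt { text: &owned };
    assert_eq!(excerpt.first_word(), "hello");
} // `owned` is dropped here, after every borrow of it has ended
```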
On paper Rust lifetimes appear to be a genius idea. Besides manual resource management and GC, this is a viable third option that combines the advantages of the two while avoiding their disadvantages, although only partially so.
Google Chrome developers tried this in C++ and did not succeed enough for it to be viable: Borrowing Trouble: The Difficulties of a C++ Borrow-Checker.
Statements & expressions
- The last nonterminal symbol in a block may be an expression (lacking the final semicolon), in which case the whole block is an expression with this value. This makes `return expression;` in functions replaceable with `expression`. It also merges the `if` statement and the ternary operator into a single `if` expression.
- `loop` starts an infinite loop. `break` may return an expression, making the loop an expression. Nested loops may have labels, `'label: loop {`, which make `break 'label;` possible.
- `for` is a range loop. When possible, `for` loops seem to be more idiomatic than `while` loops.
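A short sketch of blocks, `if`, and `loop` used as expressions, including a labeled `break` with a value. The function names are invented for the example.

```rust
/// `if` is an expression, so no ternary operator is needed.
fn sign(n: i32) -> &'static str {
    if n < 0 { "negative" } else if n == 0 { "zero" } else { "positive" }
}

/// `break` with a value turns the `loop` into an expression.
fn first_square_at_least(limit: u32) -> u32 {
    let mut n = 0;
    loop {
        if n * n >= limit {
            break n * n;
        }
        n += 1;
    }
}

/// A labeled break leaves the outer loop from inside the inner `for`.
fn find_factors(target: u32) -> Option<(u32, u32)> {
    let mut a = 2;
    'outer: loop {
        if a * a > target {
            break None;
        }
        for b in a..=target / a {
            if a * b == target {
                break 'outer Some((a, b));
            }
        }
        a += 1;
    }
}

fn main() {
    assert_eq!(sign(-3), "negative");
    assert_eq!(first_square_at_least(10), 16);
    assert_eq!(find_factors(15), Some((3, 5)));
    assert_eq!(find_factors(7), None);
}
```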
Type aliases, structs & enums
- Rust type alias `type Foo = ExistingType` is like C++ `using Foo = ExistingType`. It can be generic: `type Foo<T> = std::result::Result<T, std::io::Error>`.
- Structures use the `struct` keyword and contain only type-annotated fields.
- If, during struct construction, a field and the variable it is initialized from have the same name, one of them can be omitted (field init shorthand).
- `..var` in struct construction takes all the unspecified fields from `var` of the same struct type.
- Tuple structs, `struct Foo(i64, i64, i64)`, are structs that are very similar to tuples. Fields are unnamed.
- Unit structs: `struct Foo;`
- If structs need to store references, then lifetimes have to be used.
- The attribute `#[derive(Debug)]` on a struct allows using `{:?}` in `println!` to dump the fields.
- The `dbg!(value)` macro may be inserted as an expression to dump `value`.
- Structs may have methods attached to them, in separate `impl StructName` blocks.
- The first argument of a method may be one of:
  - `&self`, corresponds to a C++ `const` method;
  - `&mut self`, corresponds to a regular C++ method;
  - `self`, consumes the object.
- A function in an `impl StructName` block not taking `self` is an associated function, not a method, corresponding to a C++ static method. It is called through `StructName::` syntax.
- There may be multiple `impl` blocks.
- The simplest Rust enum roughly matches a C++ `enum class`. But each enum variant may have different data associated with it, making it similar to `std::variant`.
- The standard library `Option` enum handles the use cases for `nullptr` (a null reference does not exist in Rust). Similar to C++ `std::optional`.
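The struct and enum features above might look like this in practice. `Point`, `Shape`, and `rough_area` are hypothetical, and the circle area deliberately approximates pi as 3 to stay in integers.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i64,
    y: i64,
}

impl Point {
    /// Associated function (no `self`), like a C++ static method.
    fn origin() -> Self {
        Point { x: 0, y: 0 }
    }
    /// `&self`: like a C++ const method.
    fn manhattan(&self) -> i64 {
        self.x.abs() + self.y.abs()
    }
    /// `&mut self`: like a regular C++ method.
    fn shift(&mut self, dx: i64) {
        self.x += dx;
    }
    /// `self`: consumes the object.
    fn into_tuple(self) -> (i64, i64) {
        (self.x, self.y)
    }
}

/// Enum variants may carry different data, similar to std::variant.
#[derive(Debug, PartialEq)]
enum Shape {
    Empty,
    Circle { radius: i64 },
    Rect(Point, Point),
}

/// `Option` stands in for a nullable result.
fn rough_area(shape: &Shape) -> Option<i64> {
    match shape {
        Shape::Empty => None,
        Shape::Circle { radius } => Some(3 * radius * radius), // pi taken as 3
        Shape::Rect(a, b) => Some((b.x - a.x).abs() * (b.y - a.y).abs()),
    }
}

fn main() {
    let x = 3;
    let mut p = Point { x, y: 4 }; // field init shorthand for `x`
    let q = Point { y: 10, ..p.clone() }; // remaining fields taken from a copy of `p`
    p.shift(2);
    assert_eq!(p.manhattan(), 9);
    assert_eq!(q, Point { x: 3, y: 10 });
    println!("{:?}", q); // available thanks to #[derive(Debug)]
    assert_eq!(p.into_tuple(), (5, 4));

    let shapes = [
        Shape::Empty,
        Shape::Circle { radius: 2 },
        Shape::Rect(Point::origin(), q),
    ];
    for s in &shapes {
        println!("{:?} -> {:?}", s, rough_area(s));
    }
}
```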
Closures & function pointers
- Closures: `||` with parameters inside, followed by an (optionally braced) expression. Parameter and return types are usually not annotated.
- Once closure types are inferred, they don't change: the same closure cannot be called with arguments of different types.
- Variables are captured by different types of borrowing or ownership taking, chosen implicitly depending on what the code does. The `move` keyword before `||` forces taking ownership when the body does not need it implicitly. One use case is passing data to a new thread.
- All closures implement the `FnOnce` trait, meaning they can be called at least once.
- Closures that mutate captured values but don't move them out implement `FnOnce` and `FnMut`.
- Closures that don't mutate captured values and don't move them out implement `FnOnce`, `FnMut`, and `Fn`.
- All functions coerce to the `fn` type, which is the function pointer type. It implements all of `FnOnce`, `FnMut`, and `Fn`.
- To return a closure, use a trait object, e.g. `-> Box<dyn Fn ... >`.
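A sketch of the three closure traits, a plain function used where a closure is expected, and a closure returned as a trait object. All names here are illustrative.

```rust
/// Accepts any `Fn` closure or a plain function.
fn apply_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(f(x))
}

/// Returns a closure as a trait object; `move` makes it own `n`.
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    // Captures nothing and mutates nothing: implements Fn, FnMut and FnOnce.
    let add_one = |x: i32| x + 1;
    assert_eq!(apply_twice(add_one, 5), 7);
    assert_eq!(apply_twice(double, 5), 20); // a plain function works too

    // Mutates a captured variable: FnMut and FnOnce, but not Fn.
    let mut total = 0;
    let mut accumulate = |x: i32| total += x;
    accumulate(2);
    accumulate(3);
    assert_eq!(total, 5);

    // Moves a captured value out: FnOnce only.
    let name = String::from("worker");
    let consume = move || name;
    assert_eq!(consume(), "worker");

    assert_eq!(make_adder(10)(5), 15);
}
```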
Generics & traits
- Generics (types) and traits (behavior) resemble C++ templates. Traits also resemble interfaces in other languages.
- Generic type uses must be constrained by traits: no SFINAE, more like C++ concepts.
- `impl Foo<f32> {...}` adds an implementation for a specific type, similar to C++ template specialization.
- Separate traits are implemented for structs in separate blocks: `impl Trait for Struct { ... }`.
- Traits need to be brought into scope too; pulling in an implementing type is not sufficient to call trait methods.
- Can implement a local trait on an external type or an external trait on a local type, but not an external trait on an external type (so no C++ `std::hash` specialization for a `std::` type). This is to avoid allowing multiple trait implementations for the same type.
- Trait methods may have default implementations, which may call other, possibly unimplemented, methods in the same trait. The default implementation may not be called from an overriding implementation.
- Trait-type parameters without generics syntax: `fn foo(bar: &impl Trait)`.
- Trait bounds, using generics syntax: `fn foo<T: Trait>(bar: &T)`. Same as above.
- Multiple trait bounds: `fn foo(bar: &(impl Trait1 + Trait2))` and `fn foo<T: Trait1 + Trait2>(bar: &T)`.
- In case trait bounds become long, `where` clauses pull them aside:

    fn foo<T, U>(t: &T, u: &U) -> Result
    where
        T: Trait1 + Trait2,
        U: Trait1 + Trait3,
    {
        ...
    }

- Can use `fn ... -> impl Trait` to return a trait-implementing type, as long as it's a single type.
- Can conditionally implement methods for generic structs by adding trait bounds to their implementation: `impl<T: Trait> Type<T> { ... }`. Implementing a trait for every type that satisfies a bound, `impl<T: Trait> OtherTrait for T { ... }`, is called a blanket implementation.
- Rust does not have OOP inheritance. Some form is available through default trait method implementations. Dynamic dispatch (C++ virtual methods) is through trait objects.
- A trait object is a pointer to an instance of a type plus a pointer to a vtable.
- Struct and enum variables in Rust are not objects; trait objects come close, but data cannot be added to them.
- A trait object must be behind a reference (or a smart pointer) to a `dyn` trait type, e.g. `Box<dyn Trait>`.
- Associated types: `type Name` allows using `Name` as a type in a trait before its definition is given by the trait implementors. In C++ one would use dependent typenames of a template argument.
- Default generic type parameters: `<T=DefaultType>`.
- `foo.bar()` can be replaced by `Trait::bar(&foo)` when `bar` is implemented by more than one trait, to disambiguate. If even more disambiguation is needed, e.g. for associated functions without a `self` parameter, `<Type as Trait>::bar` calls the function from `Trait` as implemented for `Type`.
- If a trait depends on another trait, the latter is called a supertrait: `trait Foo: SuperTraitBar`.
- Newtype pattern. One use case: to implement external traits on external types, declare a new thin wrapper tuple struct. There are other use cases.
Error handling
- The `panic!` macro unwinds and exits (or aborts, depending on config) on an unrecoverable error.
- Errors are handled using the `Result` enum, which can be `Ok` or `Err`.
- `unwrap` returns the success variant of `Result` or panics.
- `unwrap_or_else` executes the given closure instead of panicking.
- `expect` is like `unwrap` with a given error message for panicking.
- The `?` operator after a call, e.g. `let foo = bar()?;`, unwraps the returned `Result`, or returns from the calling function with the error. If the error types do not match, the `From` trait converts.
- The `?` operator works with `Option` return types too.
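A sketch of `?` with a `From` conversion between error types, and of `?` on `Option`. `ConfigError`, `parse_port`, and `first_char_upper` are hypothetical names.

```rust
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing,
    BadNumber(ParseIntError),
}

/// Lets `?` convert a `ParseIntError` into a `ConfigError`.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

fn parse_port(input: Option<&str>) -> Result<u16, ConfigError> {
    let text = input.ok_or(ConfigError::Missing)?; // early return on `None`
    let port = text.trim().parse::<u16>()?; // error converted through `From`
    Ok(port)
}

/// `?` works in functions returning `Option` too.
fn first_char_upper(s: &str) -> Option<char> {
    let c = s.chars().next()?;
    Some(c.to_ascii_uppercase())
}

fn main() {
    assert_eq!(parse_port(Some(" 8080 ")), Ok(8080));
    assert_eq!(parse_port(None), Err(ConfigError::Missing));

    // `unwrap_or_else` runs the closure instead of panicking.
    let port = parse_port(Some("not a port")).unwrap_or_else(|_| 80);
    assert_eq!(port, 80);

    assert_eq!(first_char_upper("rust"), Some('R'));
}
```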
Iterators
- Rust iterators correspond to C++ ranges (or iterator pairs).
- Calling `.iter()` on a collection roughly corresponds to a C++ `.cbegin()`, except that the latter is not a range. Other options are `.into_iter()` to take ownership of values (not sure what a direct C++ mapping would be) and `.iter_mut()` over mutable references (C++ `.begin()`).
- Iterators implement the `Iterator` trait.
- The iterator `.collect()` method returns a new collection from iterating.
- Code using iterator adapters might be faster than equivalent loop-based code. An example of Rust zero-cost abstractions, which of course are found in C++ as well.
Pattern matching & related control flow
- Pattern matching can decompose a tuple into local variables, and do many other things.
- `match` is a generalized `switch` with pattern matching, variable binding, and more.
- `_` is a catch-all non-binding pattern, like the `default` in `switch`. Ignores the entire value.
- `if let` behaves like a single `match` arm, combining `if` with variable binding in the case of a true condition.
- A `while let` loop repeats for as long as its pattern matches.
- The value after the `for` keyword in a `for` loop is a pattern.
- The `let` keyword takes a pattern, not a variable id.
- Function parameters are patterns.
- Patterns are refutable or irrefutable, the latter matching any possible passed value. Function parameters, `let`, and `for` take irrefutable patterns. `if let` and `while let` take both kinds, with a compiler warning if irrefutable (as that creates an always-true `if` or an infinite `while` loop). `match` arms must be refutable except for the last one, which should be irrefutable (if the possibilities were not exhausted until then).
- Multiple patterns can be combined with `|`.
- An inclusive range of values can be matched with `..=`. The range cannot be empty.
- Struct destructuring: `let Foo { x: a, y: b } = v` gets `a` and `b`. If field and variable names match, then `let Foo { x, y } = v`. Literals can be used too.
- Enum destructuring: `Foo::Variant { x, y }`, `Foo::VariantWithNoData`.
- Can destructure arbitrarily deeply nested structs and enums.
- A nested `_` ignores just that part.
- Starting a variable name with `_` suppresses unused variable warnings for it.
- `..` is a greedy sequence of `_`, e.g.

    let numbers = (1, 5, 7, 20, 30);
    match numbers {
        (first, .., last) => ...
    }

- `match` arms may have match guards, which are extra `if` conditions that can use the bound variables: `Some(x) if x > 5`. Guards are not taken into account by the exhaustiveness check.
- `@` bindings allow creating a variable holding the tested value at match time, e.g. `Message::Hello { id: id_var @ 3..=7 }`.
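The pattern forms above, put together in one sketch. The `Message` enum follows the `Message::Hello` example from the list; its other variants and the function names are assumptions for illustration.

```rust
enum Message {
    Hello { id: u32 },
    Move { x: i32, y: i32 },
    Quit,
}

fn classify(msg: Message) -> String {
    match msg {
        // `@` binds the value while testing it against a range.
        Message::Hello { id: id_var @ 3..=7 } => format!("id {} in range", id_var),
        // Match guard: an extra condition on the bound variable.
        Message::Hello { id } if id % 2 == 0 => format!("even id {}", id),
        Message::Hello { id: _ } => String::from("other id"),
        // A literal inside a struct-like pattern.
        Message::Move { x: 0, y } => format!("vertical move to {}", y),
        // `..` ignores the remaining fields.
        Message::Move { x, .. } => format!("move with x = {}", x),
        Message::Quit => String::from("quit"),
    }
}

/// `let` takes an irrefutable pattern; `..` skips the middle elements.
fn ends(numbers: (i32, i32, i32, i32, i32)) -> (i32, i32) {
    let (first, .., last) = numbers;
    (first, last)
}

/// `while let` loops for as long as the pattern keeps matching.
fn drain_sum(mut stack: Vec<i32>) -> i32 {
    let mut sum = 0;
    while let Some(top) = stack.pop() {
        sum += top;
    }
    sum
}

fn main() {
    assert_eq!(classify(Message::Hello { id: 5 }), "id 5 in range");
    assert_eq!(classify(Message::Move { x: 0, y: 4 }), "vertical move to 4");
    assert_eq!(classify(Message::Quit), "quit");
    assert_eq!(ends((1, 5, 7, 20, 30)), (1, 30));
    assert_eq!(drain_sum(vec![1, 2, 3]), 6);

    // `if let` is a single-arm match.
    if let Some(n) = "12".parse::<i32>().ok() {
        assert_eq!(n, 12);
    }
}
```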
Standard library types
- Dynamic strings: `String` corresponds to the C++ `std::string` type, but is UTF-8. The `Display` trait adds the `to_string` method. Not indexable, to avoid byte/UTF-8 encoding mixups. Slicing is allowed but runtime-checked to fall on char boundaries. To disambiguate the byte/char interpretation, use `.chars()` or `.bytes()`.
- Standard library vectors match `std::vector`. The macro `vec![1, 2, 3]` creates a vector with the given contents. Ownership/borrowing rules apply to the whole vector, e.g. while a reference to the first element is held, a new element cannot be pushed to the back.
- Standard library hash maps (`HashMap`) correspond to `std::unordered_map`.
- `std::thread::spawn(closure)` corresponds to constructing a C++ `std::thread` with a callable.
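A quick sketch of byte versus char views of a `String`, vector borrowing, and a hash map. `word_counts` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Counts whitespace-separated words.
fn word_counts(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let s = String::from("h\u{e9}llo"); // "héllo"; the 'é' takes two bytes in UTF-8
    assert_eq!(s.len(), 6); // length in bytes
    assert_eq!(s.chars().count(), 5); // length in chars
    assert_eq!(&s[0..1], "h"); // slicing on a char boundary is fine
    assert!(s.get(1..2).is_none()); // this range would end inside 'é'

    let n = 42.to_string(); // `to_string` comes with `Display`
    assert_eq!(n, "42");

    let mut v = vec![1, 2, 3];
    v.push(4);
    let first = &v[0];
    // v.push(5); // would not compile here: `v` is still borrowed by `first`
    assert_eq!(*first, 1);

    let counts = word_counts("a b a");
    assert_eq!(counts["a"], 2);
}
```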
Smart pointers & dynamically-sized (unsized) types
- Smart pointers own data. They implement the `Deref` and `Drop` traits. `String` and `Vec<T>` are smart pointers.
- `Box<T>` is like `std::unique_ptr<T>` in C++, except that Rust is more likely to use plain references and lifetimes, so there is no 1:1 mapping in e.g. a rewrite. `Box::new` is `std::make_unique`.
- Implementing the `Deref` trait enables dereferencing with the `*` operator, like overloading the C++ `*` and `->` operators does.
- Under the hood `*x` is transformed to `*(x.deref())` exactly once.
- `std::mem::drop` corresponds to C++ `std::unique_ptr::reset` or other early destruction.
- `Rc<T>` matches `std::shared_ptr`, with a non-atomic reference count. The `Rc::clone` method matches the `std::shared_ptr` copy constructor.
- Interior mutability pattern: `unsafe` code behind a safe API to mutate data inside an immutable value even with immutable references present.
- `RefCell<T>` does borrow checking at runtime instead of compile time, through its `borrow` and `borrow_mut` methods.
- The `Rc<RefCell<T>>` pattern implements multiple owners of potentially-mutable data.
- `Weak<T>` matches `std::weak_ptr`. Constructed by `Rc::downgrade`. Upgraded by the `upgrade` method, corresponding to `std::weak_ptr::lock`.
- Dynamically sized types (DSTs), or unsized types, have sizes that are only known at runtime. Variables of such types cannot be created directly; they are always hidden behind some pointer + size structure. Rust automatically implements the `Sized` trait for every non-DST, and implicitly bounds every generic type parameter by it. To relax the latter, use `fn foo<T: ?Sized>...`. The `?Trait` syntax is only available for the `Sized` trait.
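A sketch of a custom `Deref` type with deref coercion, the `Rc<RefCell<T>>` pattern, and `Weak`. `Wrapper` and the demo functions are invented for the example.

```rust
use std::cell::RefCell;
use std::ops::Deref;
use std::rc::{Rc, Weak};

/// A minimal smart-pointer-like wrapper.
struct Wrapper<T>(T);

impl<T> Deref for Wrapper<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

fn byte_len(s: &str) -> usize {
    s.len()
}

/// Two owners mutate the same value; borrows are checked at runtime.
fn shared_counter_demo() -> (i32, usize) {
    let shared = Rc::new(RefCell::new(0));
    let other = Rc::clone(&shared); // bumps the reference count
    *other.borrow_mut() += 5;
    *shared.borrow_mut() += 1;
    let value = *shared.borrow();
    (value, Rc::strong_count(&shared))
}

/// A weak pointer can be upgraded only while a strong owner exists.
fn weak_demo() -> bool {
    let strong = Rc::new(String::from("data"));
    let weak: Weak<String> = Rc::downgrade(&strong);
    let alive_before = weak.upgrade().is_some();
    drop(strong); // early destruction of the last strong owner
    alive_before && weak.upgrade().is_none()
}

fn main() {
    let w = Wrapper(String::from("hello"));
    // Deref coercion: &Wrapper<String> -> &String -> &str
    assert_eq!(byte_len(&w), 5);
    assert_eq!(*Wrapper(7), 7); // `*x` is `*(x.deref())`

    let boxed: Box<i32> = Box::new(41); // like std::make_unique
    assert_eq!(*boxed + 1, 42);

    assert_eq!(shared_counter_demo(), (6, 2));
    assert!(weak_demo());
}
```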
Operator overloading
- Operator overloading is done by implementing the desired traits in `std::ops`.
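For instance, a hypothetical `Vec2` type could support `+` and unary `-` like this:

```rust
use std::ops::{Add, Neg};

#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: i32,
    y: i32,
}

/// Enables `a + b`.
impl Add for Vec2 {
    type Output = Vec2;

    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

/// Enables `-a`.
impl Neg for Vec2 {
    type Output = Vec2;

    fn neg(self) -> Vec2 {
        Vec2 { x: -self.x, y: -self.y }
    }
}

fn main() {
    let a = Vec2 { x: 1, y: 2 };
    let b = Vec2 { x: 3, y: 4 };
    assert_eq!(a + b, Vec2 { x: 4, y: 6 });
    assert_eq!(-a, Vec2 { x: -1, y: -2 });
}
```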
Concurrency
- `thread::spawn` returns a `JoinHandle`, which has a `join` method, similar to C++ `std::thread::join`.
- Message passing for inter-thread communication, like in Go. Channels are created with `std::sync::mpsc::channel()`. The endpoints have `send`, `recv`, and `try_recv` methods. The receiver can be iterated over too. The channels may have multiple transmitters (it's MPSC), which are created by `.clone()`.
- Messages must implement the `Send` trait. If a type is composed of `Send` types only, it becomes `Send` automatically.
- Types whose variables are safe to be referenced from multiple threads implement the `Sync` trait. If `&T` is `Send`, then `T` is `Sync`. A type made of `Sync` types only is `Sync` automatically.
- `Mutex<T>` is a mutex-guarded variable of type `T`. `.lock()` returns a `LockResult` wrapping a `MutexGuard`, which dereferences to the (potentially mutable) guarded data. The guard unlocks when it goes out of scope. `Mutex` implements interior mutability.
- `Arc<T>` is `Rc<T>` with an atomic reference count. Since the `std::shared_ptr` reference count is atomic as well, `Arc<T>` is the closer match for `std::shared_ptr` in multithreaded code. C++ `std::atomic<std::shared_ptr>` is something else: it makes operations on the pointer itself atomic.
- To actually share mutexes between threads, wrap them: `Arc<Mutex<T>>`.
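A sketch of both sharing styles: a counter behind `Arc<Mutex<T>>` and an MPSC channel with cloned senders. The function names and worker counts are arbitrary.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Several threads increment one counter guarded by a mutex.
fn parallel_count(workers: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The guard unlocks at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

/// Each producer thread sends its id; the receiver sums them up.
fn channel_sum(producers: u32) -> u32 {
    let (tx, rx) = mpsc::channel();
    for id in 1..=producers {
        let tx = tx.clone(); // one more transmitter
        thread::spawn(move || {
            tx.send(id).unwrap();
        });
    }
    drop(tx); // drop the original sender so the iteration below can end
    let total: u32 = rx.iter().sum();
    total
}

fn main() {
    assert_eq!(parallel_count(4, 1000), 4000);
    assert_eq!(channel_sum(4), 10);
}
```

Iterating the receiver ends once every sender has been dropped, which is why the original `tx` is dropped explicitly.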
Assorted standard library functionality
- `std::env::args` is for `int argc, char *argv[]`. It checks for valid Unicode; if that hurts, use `std::env::args_os`.
- `std::process::exit` is for `exit`.
- `std::env::var` is for `getenv`.
- `println!` prints to `stdout`, `eprintln!` to `stderr`.
Build and dependency management
- cargo seems to be a much better story than CMake hell or its alternatives.
- A crate is the smallest compilation unit, either a library crate or a binary crate. "Crate" usually means the former. A crate root is the starting source file in it. A package is a bundle of crates with at most one library crate. Standard paths inside a package: `src/main.rs`, `src/lib.rs`, `src/bin`.
- Release profiles correspond to a mixture of CMake build configurations, the `NDEBUG` define, etc. in C++. The `dev` profile corresponds to `Debug`, and `release` to `Release` (or `RelWithDebInfo`?).
- Profiles can be customized in `Cargo.toml`, e.g. optimization levels.
- Workspaces organize related packages together in large projects, to share the directory root, `Cargo.toml`, and `Cargo.lock`.
Modules
Not familiar enough with C++ modules to compare.
- Modules (and submodules) inside a crate do namespacing and public/private visibility.
- `src/modulename.rs`, `src/modulename/submodulename.rs`.
- Modules can be public or private, declared with `pub mod` and `mod` respectively.
- `super::` as a part of a name path goes one level up.
- The `use` keyword imports. Idiomatically functions are imported through their parent module, everything else directly.
- `use ... as ...` creates a name alias.
- `pub use` re-exports. Used to organize and collect public API from several potentially nested submodules.
- Nested path syntax: `use foo::{bar, baz, self};`, globs: `use foo::*;`
Tooling
- `rustfmt` formats code, as `clang-format` does.
- `Clippy` is the linter.
- `rust-analyzer` for LSP support.
Documentation
- Documentation header comments start with `///` and support Markdown. Built by `cargo doc [--open]`.
- Typical API doc sections: Examples, Panics, Errors, Safety.
- Contained documentation comments start with `//!`, typically used for crates and modules.
Testing & benchmarking
- `#[test]` annotates a function to be a test, like the Google Test `TEST` macro in C++. Tests run in parallel by default.
- `assert_eq!` is like gtest `ASSERT_EQ`, except that the args are 'left' and 'right' instead of 'expected' and 'actual' or similar. Equality asserts may be applied to types implementing the `PartialEq` and `Debug` traits.
- `assert!` may take second and subsequent args for a message in case of failure.
- Tests annotated with `#[should_panic]` test that the annotated function panics, similar but not identical to gtest death tests. Best to add the `expected` parameter to the attribute to specify the reason for the panic.
- Tests may also be implemented by returning a `Result<T, E>`.
- Benchmark tests correspond to Google Benchmark, but are unstable ATM.
- Documentation tests can compile API examples automatically.
- Unit tests go with the code they test, in a `mod tests` annotated with `#[cfg(test)]`.
- Visibility rules happen to allow the testing of private functions.
- Integration tests go to a top-level `tests` directory, with no configuration annotation. Each file there is a separate (test) crate; if that's not what's needed, e.g. for setup code, use the `foo/mod.rs` naming convention for non-tests.
- `cargo test` runs in sequence: unit, integration, doc. It does not go to the next category on failure.
- Binary crates cannot have integration tests directly. The usual thing to do is to always have a library crate, with the binary crate as minimal as possible.
Macros
- There are macros, with names ending in an exclamation mark (`println!`).
- Macros take Rust code and expand to different Rust code. A difference from the C++ preprocessor is that they work on tokens and syntax fragments, not on raw text. While powerful, how well does this work with tooling? Do tools run macros? Can they refactor macros?
- Declarative macros (`macro_rules!`) pattern-match given code to produce code.
- `#[macro_export]` annotation for public macros.
- Procedural macros take token stream input and produce token stream output.
- One kind is custom `derive` macros, which add code to a struct implementation.
- Attribute-like macros allow creating new attributes.
- Function-like macros are close to C preprocessor function-like macros, except that they also operate on a `TokenStream` and not on arguments directly. They can take a variable number of arguments.
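A sketch of two declarative macros, one with repetition and one recursive. `strings!` and `max_of!` are hypothetical macros written for this example; procedural macros need a separate crate and are not shown.

```rust
/// Builds a `Vec<String>` from any number of `Display` values.
macro_rules! strings {
    ($($item:expr),* $(,)?) => {
        vec![$($item.to_string()),*]
    };
}

/// Recursive macro: the maximum of one or more expressions.
macro_rules! max_of {
    ($x:expr) => { $x };
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn main() {
    let v = strings!["a", 1, 2.5];
    assert_eq!(v, vec!["a", "1", "2.5"]);
    assert_eq!(max_of!(3, 9, 4), 9);
}
```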
Unsafe Rust & FFI
- `unsafe { ... }` allows some, well, unsafe features.
- `unsafe` code can dereference raw pointers `*const T`, `*mut T`. Raw pointers are just like C raw pointers.
- `unsafe fn foo() {}`: then `foo` can only be called from unsafe code.
- `extern "C" { fn putenv ... }` for FFI; such functions may only be called from unsafe code.
- To make a Rust function callable from external code, add the `#[no_mangle]` annotation and `pub extern "C"` before the `fn`.
- Static variables are declared with `static FOO_BAR: type = value;`. Immutable static variables have an address in memory; constants don't. All mutable static variables are unsafe.
- `unsafe trait Foo`, `unsafe impl Foo for Bar`.
- `union` types exist, mainly used for interfacing with C unions; accessing their fields is unsafe.
- Raw identifier syntax `r#while` allows using e.g. a keyword as an identifier. Useful for FFI and for interfacing between different Rust editions.
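A sketch limited to raw pointers, an `unsafe fn` behind a safe wrapper, and an immutable static. The FFI items are left out because their exact spelling depends on the Rust edition. All names are illustrative.

```rust
static GREETING: &str = "hello"; // immutable static: has an address, safe to read

/// Safety: `ptr` must be valid for reads of `len` consecutive `i32` values.
unsafe fn sum_raw(ptr: *const i32, len: usize) -> i32 {
    let mut total = 0;
    for i in 0..len {
        // Dereferencing a raw pointer needs unsafe code.
        unsafe {
            total += *ptr.add(i);
        }
    }
    total
}

/// Safe wrapper: a slice guarantees the pointer/length contract.
fn sum_slice(values: &[i32]) -> i32 {
    unsafe { sum_raw(values.as_ptr(), values.len()) }
}

/// Creating a raw pointer is safe; only dereferencing it is not.
fn bump(value: &mut i32) {
    let p: *mut i32 = value;
    unsafe {
        *p += 1;
    }
}

fn main() {
    assert_eq!(sum_slice(&[1, 2, 3]), 6);
    let mut n = 1;
    bump(&mut n);
    assert_eq!(n, 2);
    assert_eq!(GREETING.len(), 5);
}
```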