Primitive types
- Integer types are sized, except for `isize` and `usize`, whose width is architecture-specific and corresponds roughly to C++ `std::ptrdiff_t`/`std::intptr_t` (or POSIX `ssize_t`) and `std::size_t`/`std::uintptr_t` respectively.
- `char` != `u8` != `i8`, oh thank god.
- String literals `"foo"` are in UTF-8. They are slices (see below).
- Tuples are built into the language, use parentheses syntax, and roughly correspond to `std::tuple` and `std::pair`. The `.i` syntax accesses the i-th element. The unit tuple is `()`.
- Fixed-size arrays are built into the language, use bracket syntax, and roughly correspond to `std::array` with mandatory bounds-checking.
- Simple types implement the `Copy` trait (i.e. scalars, corresponding to primitive types in Java) and their assignment copies. Tuples copy if all their elements copy, and this may include arbitrarily large tuples. Otherwise, assignment moves, and a copy has to be made by an explicit `.clone()` operation.
- String slices have type `&str` and can be created with the `&var[x..y]` syntax; `x` and/or `y` may be omitted if they are zero and the length respectively. A slice consists of a pointer and a length.
- General slices are very similar. They have the `&[element_type]` type.
- `!` is an empty type with no values. It is used as the return type of functions that never return. It can be coerced into any other type, which is used e.g. for `match` arms that `continue` or `panic!`.
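A minimal sketch of the points above: tuples, arrays, slices, and copy versus move. The names `first_word`, `sum` and `demo` are made up for illustration; `demo` is meant to be called from `main`.

```rust
// Illustrative sketch; `first_word`, `sum` and `demo` are made-up names.

/// Returns the first word of `s` as a string slice borrowed from it.
fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(end) => &s[..end], // start index omitted because it is zero
        None => s,
    }
}

/// Sums a general slice; arrays and vectors can both be passed as `&[i32]`.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn demo() {
    // Tuples: parentheses syntax, `.i` element access.
    let pair: (i32, char) = (7, 'x');
    assert_eq!(pair.0, 7);

    // A tuple of `Copy` elements is `Copy`: `pair` stays usable after assignment.
    let copied = pair;
    assert_eq!(pair, copied);

    // `String` is not `Copy`: assignment moves, `.clone()` copies explicitly.
    let owned = String::from("hello world");
    let cloned = owned.clone();
    let moved = owned; // `owned` can no longer be used from here on
    assert_eq!(moved, cloned);

    // Fixed-size array and slices into it.
    let arr: [i32; 4] = [1, 2, 3, 4];
    assert_eq!(sum(&arr[1..3]), 5);
    assert_eq!(sum(&arr), 10);

    assert_eq!(first_word(&moved), "hello");
    // `char` is a 4-byte Unicode scalar value, not a byte.
    assert_eq!(std::mem::size_of::<char>(), 4);
}
```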
Variables & type inference
- Variables (`let`) are immutable by default, but their names can be shadowed, allowing sequences of changes in what a given variable name means. The type can be changed too. Mutable variables are declared with `let mut`.
- Even immutable variables may be declared without initialization, as long as they are assigned exactly once before their first use.
- Variable types are inferred, but can be annotated when/if needed.
- Constants are declared with `const`, and require type annotations.
- Function parameter and return types must always be provided.
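A short sketch of shadowing, deferred initialization and constants, assuming nothing beyond the standard library. `MAX_RETRIES`, `parse_and_double`, `describe` and `count_attempts` are hypothetical names.

```rust
// Illustrative sketch; all names are made up.

const MAX_RETRIES: u32 = 3; // constants need a type annotation

/// Parses `text` and doubles it, shadowing the same name along the way.
fn parse_and_double(text: &str) -> i64 {
    let value = text.trim(); // &str
    let value: i64 = value.parse().unwrap(); // shadowed, now an i64
    let value = value * 2; // shadowed again
    value
}

/// An immutable variable may be declared first and initialized later,
/// as long as every path assigns it exactly once before it is read.
fn describe(n: i32) -> &'static str {
    let label;
    if n < 0 {
        label = "negative";
    } else {
        label = "non-negative";
    }
    label
}

/// `let mut` is needed because `attempts` is reassigned.
fn count_attempts() -> u32 {
    let mut attempts = 0; // inferred as u32 from the comparison below
    while attempts < MAX_RETRIES {
        attempts += 1;
    }
    attempts
}
```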
Lifetimes, references & borrow checker
- All variables are owned by their enclosing scope. The ownership may be passed around; when the last owning scope exits, the variable is destroyed (`drop` is executed if the `Drop` trait is implemented, memory released).
- Passing a variable into a function moves it (or copies it), transferring its ownership into the function. Likewise, returning it moves it (or copies it), transferring ownership of the return value to the caller.
- Creating references is called borrowing. Immutable references do not allow modifying the pointed-to variable, are borrowed with `&`, and can be passed around without transferring ownership. Several of them may be active at the same time. Mutable references are borrowed with `&mut`, and no other mutable or immutable references may exist at the same time.
- Each reference has a lifetime. Most cases are handled by implicit lifetimes. Lifetimes are named like `'a`, usually with very short names, placed after `&`.
- For function signatures, generic lifetime parameters use angle brackets: `fn foo<'a>(x: &'a str, y: &'a str) -> &'a str`
- Lifetime annotations in struct definitions limit the struct's lifetime to that of the references in its fields.
- If there are multiple input lifetime parameters, but one of them is `&self` or `&mut self`, its lifetime is assigned to all output lifetime parameters.
- Deref coercion converts a reference to a `Deref`-implementing type to a reference to a different type, e.g. `&String` to `&str`. It happens automatically on a parameter-argument type mismatch: from `&T` to `&U` when `T: Deref<Target=U>`. Internally, as many `.deref()` calls are inserted as needed.
- For mutable references, implement the `DerefMut` trait. Two extra deref coercion rules: from `&mut T` to `&mut U` when `T: DerefMut<Target=U>`, and from `&mut T` to `&U` when `T: Deref<Target=U>`.
- The `Drop` trait is the closest thing to a C++ destructor; it adds a `drop` method that takes a mutable reference to `self`.
- For structs, field lifetimes are part of the struct type, and should be specified wherever the struct type is specified.
- The `'static` lifetime is the lifetime of the whole program.
- Lifetimes are a kind of generics, so a function with both lifetimes and generic type parameters lists both together in angle brackets.
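The borrowing and lifetime rules above can be sketched as follows. `longest`, `Excerpt`, `consume`, `shout` and `demo` are made-up names, and the example is only one possible illustration.

```rust
// Illustrative sketch; all names are made up.

/// The result borrows from one of the inputs, so all three references
/// share the lifetime `'a`.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

/// A struct holding a reference cannot outlive the text it points into.
struct Excerpt<'a> {
    text: &'a str,
}

impl<'a> Excerpt<'a> {
    /// With `&self` present, elision ties the output lifetime to `self`.
    fn first_word(&self) -> &str {
        self.text.split(' ').next().unwrap_or("")
    }
}

/// Takes ownership of `s`; it is dropped when this function returns.
fn consume(s: String) -> usize {
    s.len()
}

/// Mutable borrow: the caller keeps ownership.
fn shout(s: &mut String) {
    s.push('!');
}

fn demo() {
    let mut greeting = String::from("hello");
    shout(&mut greeting);

    // Deref coercion: `&String` is accepted where `&str` is expected.
    assert_eq!(longest(&greeting, "hi"), "hello!");

    let excerpt = Excerpt { text: &greeting };
    assert_eq!(excerpt.first_word(), "hello!");

    // `greeting` is moved into `consume` and is unusable afterwards.
    assert_eq!(consume(greeting), 6);
}
```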
On paper, Rust lifetimes appear to be a genius idea. Besides manual resource management and GC, this is a viable third option that combines the advantages of the two while avoiding their disadvantages, although only partially so.
Google Chrome developers tried this for C++ and did not succeed enough for it to be viable: Borrowing Trouble: The Difficulties of a C++ Borrow-Checker.
Statements & expressions
- The last nonterminal symbol in a block may be an expression (lacking the final semicolon), in which case the whole block is an expression with this value. This makes `return expression;` at the end of a function replaceable with `expression`. This also merges the `if` statement and the ternary operator into a single `if` expression.
- `loop` starts an infinite loop. `break` may return an expression, making the loop an expression. Nested loops may have labels, `'label: loop {`, which makes `break 'label;` possible.
- `for` is a range loop. When possible, `for` loops seem to be more idiomatic than `while` loops.
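A small sketch of blocks, `if` and `loop` used as expressions, plus a labeled `break`. The function names are hypothetical.

```rust
// Illustrative sketch; all names are made up.

/// `if` is an expression, and the last expression of the body is the
/// function's return value (no `return`, no trailing semicolon).
fn sign(n: i32) -> i32 {
    if n < 0 { -1 } else if n > 0 { 1 } else { 0 }
}

/// `loop` becomes an expression through `break value`.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut p = 1;
    loop {
        if p > limit {
            break p;
        }
        p *= 2;
    }
}

/// A labeled `break` leaves the outer loop from inside the inner one.
fn find_pair_with_product(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for a in 1..10 {
        for b in a..10 {
            if a * b == target {
                found = Some((a, b));
                break 'outer;
            }
        }
    }
    found
}
```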
Type aliases, structs & enums
- Rust type alias `type Foo = ExistingType` is like C++ `using Foo = ExistingType`. It can be generic: `type Foo<T> = std::result::Result<T, std::io::Error>`.
- Structures use the `struct` keyword and contain only type-annotated fields.
- If, during struct construction, a field and the variable it is initialized from have the same name, one of them can be omitted (field init shorthand).
- `..var` in struct construction takes all the unspecified fields from `var` of the same struct type.
- Tuple structs, `struct Foo(i64, i64, i64)`, are structs that are very similar to tuples. Fields are unnamed.
- Unit structs: `struct Foo;`
- If structs need to store references, then lifetimes have to be used (see the lifetimes section).
- The `#[derive(Debug)]` attribute on a struct allows using `{:?}` in `println!` to dump the fields.
- The `dbg!(value)` macro may be inserted as an expression to dump `value`.
- Structs may have methods attached to them, in separate `impl StructName` blocks.
- The first arg of a method may be one of:
  - `&self`, corresponds to a C++ `const` method;
  - `&mut self`, corresponds to a regular C++ method;
  - `self`, consumes the object.
- A function in an `impl StructName` block not taking `self` is an associated function, not a method, corresponding to a C++ static method. Called through the `StructName::` syntax.
- There may be multiple `impl` blocks.
- The simplest Rust enum roughly matches a C++ enum class. But each enum variant may also have different data associated with it, making it similar to `std::variant`.
- The standard library `Option` enum handles the use cases for `nullptr` (a null reference does not exist in Rust). Similar to C++ `std::optional`.
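As a rough illustration of structs with methods, data-carrying enums and `Option`, here is a sketch with made-up `Point` and `Shape` types.

```rust
// Illustrative sketch; `Point`, `Shape`, `area` and `demo` are made-up names.

#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i64,
    y: i64,
}

impl Point {
    /// Associated function (no `self`), called as `Point::origin()`.
    fn origin() -> Point {
        Point { x: 0, y: 0 }
    }

    /// `&self`: read-only access, like a C++ `const` method.
    fn manhattan(&self) -> i64 {
        self.x.abs() + self.y.abs()
    }

    /// `&mut self`: may modify the object.
    fn shift(&mut self, dx: i64) {
        self.x += dx;
    }
}

/// Enum variants may carry different data, similar to `std::variant`.
#[derive(Debug)]
enum Shape {
    Circle { radius: f64 },
    Rect(f64, f64),
    Empty,
}

fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect(w, h) => w * h,
        Shape::Empty => 0.0,
    }
}

fn demo() {
    let x = 3;
    // Field init shorthand for `x`, struct update syntax for the rest.
    let mut p = Point { x, ..Point::origin() };
    p.shift(1);
    assert_eq!(p, Point { x: 4, y: 0 });
    assert_eq!(p.manhattan(), 4);
    println!("{:?}", p); // available thanks to #[derive(Debug)]

    let nothing: Option<Point> = None; // takes the place of a null pointer
    assert!(nothing.is_none());
    assert!(area(&Shape::Circle { radius: 1.0 }) > 3.0);
}
```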
Closures & function pointers
- Closures: `||` with parameters inside, followed by an expression (optionally in braces). Parameter and return types are usually not annotated.
- Once closure types are inferred, they don't change: the same closure cannot be called with arguments of different types.
- Variables are captured by different kinds of borrowing or ownership-taking, chosen implicitly depending on what the code does.
- The `move` keyword before `||` forces taking ownership when the body does not need it implicitly. One use case is passing data to a new thread.
- All closures implement the `FnOnce` trait, meaning they can be called at least once.
- Closures that mutate captured values but don't move them out implement `FnOnce` and `FnMut`.
- Closures that don't mutate captured values and don't move them out implement `FnOnce`, `FnMut`, and `Fn`.
- All functions coerce to the `fn` type, which is the function pointer type. It implements all of `FnOnce`, `FnMut`, and `Fn`.
- To return a closure, use a trait object, e.g. `-> Box<dyn Fn ... >`.
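The three closure traits and a boxed return value can be sketched like this. `apply_twice`, `make_adder`, `double` and `demo` are hypothetical names.

```rust
// Illustrative sketch; all names are made up.

/// Accepts any callable that only reads its captures (`Fn`).
fn apply_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(f(x))
}

/// Returns a closure as a trait object; `move` transfers `n` into it.
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}

fn double(x: i32) -> i32 {
    x * 2
}

fn demo() {
    // Only reads its capture: implements Fn, FnMut and FnOnce.
    let offset = 10;
    let add_offset = |x: i32| x + offset;
    assert_eq!(apply_twice(add_offset, 1), 21);

    // Mutates a capture: FnMut and FnOnce, but not Fn.
    let mut calls = 0;
    let mut count = || {
        calls += 1;
        calls
    };
    count();
    assert_eq!(count(), 2);

    // Moves a capture out: FnOnce only, so it can be called a single time.
    let name = String::from("rust");
    let into_owned = move || name;
    assert_eq!(into_owned(), "rust");

    // A plain function coerces to the `fn` pointer type.
    let fp: fn(i32) -> i32 = double;
    assert_eq!(apply_twice(fp, 3), 12);
}
```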
Generics & traits
- Generics (types) and traits (behavior) resemble C++ templates. Traits also resemble interfaces in other languages.
- Generic type uses must be constrained by traits, so no SFINAE; this is closer to C++ concepts.
- `impl Foo<f32> {...}` adds an implementation for a specific type, similar to C++ template specialization.
- Separate traits are implemented for structs in separate blocks: `impl Trait for Struct { ... }`
- Traits need to be brought into scope too; pulling in an implementing type is not sufficient to call trait methods.
- Can implement a local trait on an external type or an external trait on a local type, but not an external trait on an external type (so no C++ `std::hash` specialization for a `std::` type). This is to avoid allowing multiple trait implementations for the same type.
- Trait methods may have default implementations, which may call other, possibly unimplemented methods in the same trait. The default implementation may not be called from an overriding implementation.
- Trait-type parameters without generics syntax: `fn foo(bar: &impl Trait)`.
- Trait bounds, using generics syntax: `fn foo<T: Trait>(bar: &T)`. Same as above.
- Multiple trait bounds: `fn foo(bar: &(impl Trait1 + Trait2))` and `fn foo<T: Trait1 + Trait2>(bar: &T)`
- In case trait bounds become long, `where` clauses pull them aside: `fn foo<T, U>(t: &T, u: &U) -> Result where T: Trait1 + Trait2, U: Trait1 + Trait3, { ... }`
- Can use `fn ... -> impl Trait` to return a trait-implementing type, as long as it's a single type.
- Can conditionally implement methods for generic structs by adding trait bounds to their implementation: `impl<T: Trait> Type<T> { ... }`. Implementing a trait for every type that satisfies a bound, `impl<T: Trait> OtherTrait for T { ... }`, is called a blanket implementation.
- Rust does not have OOP inheritance. Some form is available through default trait method implementations. Dynamic dispatch (C++ virtual methods) is through trait objects.
- A trait object is a pointer to an instance of a type plus a pointer to a vtable.
  - Struct and enum vars in Rust are not objects; trait objects come close, but they cannot contain data.
  - Must be a reference (or a smart pointer) to a `dyn` trait type, e.g. `Box<dyn Trait>`.
- Associated types: `type Name` allows using `Name` as a type in a trait before its definition is given by the trait implementors. In C++ one would use template-argument-dependent typenames.
- Default generic type parameters: `<T=DefaultType>`.
- `foo.bar()` can be replaced by `Trait::bar(&foo)` to disambiguate when `bar` is implemented by more than one trait. If even more disambiguation is needed, e.g. for associated functions without a `self` parameter, `<Type as Trait>::bar` calls the function from `Trait` as implemented for `Type`.
- If a trait depends on another trait, the latter is called a supertrait: `trait Foo: SuperTraitBar`.
- Newtype pattern: declare a new thin wrapper tuple struct. One use case is implementing external traits on external types. There are other use cases.
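A sketch of default methods, static versus dynamic dispatch, and the newtype pattern, under the assumption of a made-up `Shape` trait with `Square`, `Circle` and `Csv` types.

```rust
// Illustrative sketch; all names are made up.
use std::fmt;

trait Shape {
    fn area(&self) -> f64;

    /// Default implementation that calls a required method.
    fn describe(&self) -> String {
        format!("shape with area {:.1}", self.area())
    }
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
    // Overrides the default.
    fn describe(&self) -> String {
        String::from("a circle")
    }
}

/// Static dispatch through a trait bound.
fn larger<'a, T: Shape>(a: &'a T, b: &'a T) -> &'a T {
    if a.area() >= b.area() { a } else { b }
}

/// Dynamic dispatch through trait objects, like C++ virtual calls.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Newtype: a local wrapper lets the external trait `Display` be
/// implemented for the external type `Vec<i32>`.
struct Csv(Vec<i32>);

impl fmt::Display for Csv {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let parts: Vec<String> = self.0.iter().map(|n| n.to_string()).collect();
        write!(f, "{}", parts.join(","))
    }
}

fn demo() {
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Circle(1.0))];
    assert!(total_area(&shapes) > 7.0);
    assert_eq!(Square(1.0).describe(), "shape with area 1.0");
    assert_eq!(Circle(1.0).describe(), "a circle");
    assert_eq!(larger(&Square(1.0), &Square(3.0)).0, 3.0);
    // `to_string` comes from the blanket `impl<T: Display> ToString for T`.
    assert_eq!(Csv(vec![1, 2, 3]).to_string(), "1,2,3");
}
```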
Error handling
- The `panic!` macro unwinds (or aborts, depending on config) on an unrecoverable error.
- Errors are handled using the `Result` enum, which can be `Ok` or `Err`.
- `unwrap` returns the success variant of `Result` or panics. `unwrap_or_else` executes the given code instead of panicking. `expect` is like `unwrap` with a given error message for panicking.
- The `?` operator after a call, e.g. `let foo = bar()?;`, unwraps the returned `Result`, or returns from the caller with the error. If the error types do not match, the `From` trait converts.
- The `?` operator works with `Option` return types too.
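A minimal sketch of `Result`, `?` and the `From` conversion. `ConfigError`, `parse_port` and `first_char_upper` are hypothetical, and the `key=value` input format is an assumption made for the example.

```rust
// Illustrative sketch; all names and the "key=value" format are made up.
use std::num::ParseIntError;

#[derive(Debug)]
enum ConfigError {
    Missing,
    BadNumber(ParseIntError),
}

/// Lets `?` convert a `ParseIntError` into a `ConfigError`.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

/// Parses "key=value" and returns the numeric value.
fn parse_port(line: &str) -> Result<u16, ConfigError> {
    let value = line.split('=').nth(1).ok_or(ConfigError::Missing)?;
    let port = value.trim().parse::<u16>()?; // ParseIntError -> ConfigError
    Ok(port)
}

/// `?` also works in functions returning `Option`.
fn first_char_upper(s: &str) -> Option<char> {
    let c = s.chars().next()?;
    Some(c.to_ascii_uppercase())
}

fn demo() {
    assert_eq!(parse_port("port=8080").unwrap(), 8080);
    assert!(matches!(parse_port("port"), Err(ConfigError::Missing)));
    assert!(matches!(parse_port("port=abc"), Err(ConfigError::BadNumber(_))));

    // A fallback instead of a panic.
    let port = parse_port("port=abc").unwrap_or_else(|_| 80);
    assert_eq!(port, 80);
}
```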
Iterators
- Rust iterators correspond to C++ ranges (or iterator pairs).
- Calling `.iter()` on a collection roughly corresponds to a C++ `.cbegin()`, except that the latter is not a range. Other options are `.into_iter()` to take ownership of values (not sure what a direct C++ mapping would be) and `.iter_mut()` to iterate over mutable references (C++ `.begin()`).
- Iterators implement the `Iterator` trait.
- The iterator `.collect()` method builds a new collection from the iterated values.
- Code using iterator adapters might be faster than equivalent loop-based code. An example of Rust zero-cost abstractions, which of course are found in C++ as well.
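The three ways of iterating and a hand-written iterator might look like this. `even_squares`, `Countdown` and `demo` are made-up names.

```rust
// Illustrative sketch; all names are made up.

/// Squares of the even numbers, built with lazy adapters and `collect`.
fn even_squares(values: &[i32]) -> Vec<i32> {
    values
        .iter() // yields `&i32`
        .filter(|v| **v % 2 == 0)
        .map(|v| v * v)
        .collect()
}

/// A hand-written iterator: implementing `next` is all that is required.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0 + 1)
        }
    }
}

fn demo() {
    let mut v = vec![1, 2, 3, 4];
    assert_eq!(even_squares(&v), vec![4, 16]);

    // `iter_mut` yields mutable references.
    for x in v.iter_mut() {
        *x *= 10;
    }
    assert_eq!(v, vec![10, 20, 30, 40]);

    // `into_iter` consumes the vector and yields owned values.
    let strings: Vec<String> = v.into_iter().map(|n| n.to_string()).collect();
    assert_eq!(strings, vec!["10", "20", "30", "40"]);
}
```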
Pattern matching & related control flow
- Pattern matching can decompose a tuple into local vars, and do many other things.
- `match` is a generalized `switch` with pattern matching, variable binding, and more.
- `_` is a catch-all non-binding pattern, like the `default` in `switch`. Ignores the entire value.
- `if let` behaves like a single `match` arm, combining `if` with variable binding in the case of a true condition.
- A `while let` loop repeats for as long as its pattern matches.
- The value after the `for` keyword in a `for` loop is a pattern.
- The `let` keyword takes a pattern, not a variable id.
- Function parameters are patterns.
- Patterns are refutable or irrefutable, the latter matching any possible passed value. Function parameters, `let`, and `for` take irrefutable patterns. `if let` and `while let` take both kinds, with a compiler warning if irrefutable (as that creates an always-true `if` or an infinite `while` loop). `match` arms must be refutable except for the last one, which should be irrefutable (if the possibilities were not exhausted until then).
- Multiple patterns can be combined with `|`.
- An inclusive range of values can be matched with `..=`. The range cannot be empty.
- Struct destructuring: `let Foo { x: a, y: b } = v` gets `a` and `b`. If field and var names match, then `let Foo { x, y } = v`. Literals can be used too.
- Enum destructuring: `Foo::Variant { x, y }`, `Foo::VariantWithNoData`.
- Can destructure arbitrarily deeply nested structs and enums.
- A nested `_` ignores just that part.
- Starting a variable name with an `_` suppresses unused variable warnings for it.
- `..` is a greedy sequence of `_`, e.g. `let numbers = (1, 5, 7, 20, 30); match numbers { (first, .., last) => ... }`
- `match` arms may have match guards, which are extra `if` conditions that can use the bound vars: `Some(x) if x > 5`. Guards are not taken into account by the exhaustiveness check.
- `@` bindings allow creating a var holding the tested value at match time, e.g. `Message::Hello { id: id_var @ 3..=7 }`.
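Several of the pattern forms above in one sketch. The `Message` enum mirrors the fragment quoted in the notes, but its exact shape here is an assumption, as are the function names.

```rust
// Illustrative sketch; the shape of `Message` and all function names are made up.

enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Hello { id: u32 },
}

fn describe(msg: &Message) -> String {
    match msg {
        Message::Quit => String::from("quit"),
        // A literal inside a struct pattern.
        Message::Move { x: 0, y } => format!("vertical move to {}", y),
        // `..` ignores the remaining fields; the guard adds a condition.
        Message::Move { x, .. } if *x < 0 => String::from("move left"),
        Message::Move { x, y } => format!("move to {},{}", x, y),
        // `@` binds the value while testing it against a range.
        Message::Hello { id: id_var @ 3..=7 } => format!("hello, small id {}", id_var),
        // `|` combines alternatives.
        Message::Hello { id: 0 | 1 } => String::from("hello, reserved id"),
        Message::Hello { .. } => String::from("hello"),
    }
}

/// `let` takes a pattern; `..` skips the middle of the tuple.
fn ends(numbers: (i32, i32, i32, i32, i32)) -> (i32, i32) {
    let (first, .., last) = numbers;
    (first, last)
}

/// `while let` repeats for as long as the pattern matches.
fn drain_sum(stack: &mut Vec<i32>) -> i32 {
    let mut total = 0;
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}
```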
Standard library types
- Dynamic strings: `String` would correspond to the `std::string` type, but is UTF-8. Implementing the `Display` trait provides the `to_string` method. Not indexable, to avoid byte/UTF-8 encoding mixup. Slicing is allowed but runtime-checked to fall on a char boundary. To disambiguate byte/char interpretation, use `.chars()` or `.bytes()`.
- Standard library vectors match `std::vector`. The `vec![1, 2, 3]` macro creates a vector with the given contents. Ownership/borrowing rules apply to the whole vector, i.e. if a mutable reference to the first element is taken, a new one cannot be pushed to the back while that reference is in use.
- Standard library hash maps correspond to `std::unordered_map`.
- `std::thread::spawn(closure)` corresponds to the `std::thread(callable)` constructor.
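A short sketch of `String`, `Vec` and `HashMap` behavior as described above. `word_counts` and `demo` are made-up names.

```rust
// Illustrative sketch; `word_counts` and `demo` are made-up names.
use std::collections::HashMap;

/// Counts word occurrences in `text`.
fn word_counts(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

fn demo() {
    let s = String::from("héllo");
    // Bytes and chars differ because 'é' takes two bytes in UTF-8.
    assert_eq!(s.bytes().count(), 6);
    assert_eq!(s.chars().count(), 5);

    // Slice ends must fall on char boundaries. `&s[0..2]` would panic;
    // `get` reports the problem as `None` instead.
    assert_eq!(s.get(0..1), Some("h"));
    assert_eq!(s.get(0..2), None);

    let mut v = vec![1, 2, 3];
    let first = &mut v[0];
    // v.push(4); // rejected here: `v` is still mutably borrowed via `first`
    *first = 10;
    v.push(4); // fine: the borrow has ended
    assert_eq!(v, vec![10, 2, 3, 4]);
}
```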
Smart pointers & dynamically-sized (unsized) types
- Smart pointers own data. They implement the `Deref` and `Drop` traits. `String` and `Vec<T>` are smart pointers.
- `Box<T>` is like `std::unique_ptr<T>` in C++, except that Rust is more likely to use plain references and lifetimes, so there is no 1:1 mapping e.g. in a rewrite. `Box::new` is `std::make_unique`.
- Implementing the `Deref` trait enables dereferencing with the `*` operator, like overloading the C++ `*` and `->` operators does.
- Under the hood, `*x` is transformed to `*(x.deref())` exactly once.
- `std::mem::drop` corresponds to C++ `std::unique_ptr::reset` or other early destruction.
- `Rc<T>` matches `std::shared_ptr`, but with a non-atomic reference count, so it is for single-threaded use only. The `Rc::clone` function matches the `std::shared_ptr` copy constructor.
- Interior mutability pattern: `unsafe` code, wrapped in a safe API, to mutate data inside an immutable value even with immutable references present.
- `RefCell<T>` does borrow checking at runtime instead of compile time, through the `borrow` and `borrow_mut` methods.
- The `Rc<RefCell<T>>` pattern implements multiple owners of potentially-mutable data.
- `Weak<T>` matches `std::weak_ptr`. Constructed by `Rc::downgrade`. Upgraded by the `upgrade` method, corresponding to `std::weak_ptr::lock`.
- Dynamically sized types (DST), or unsized types, are types whose sizes are only known at runtime. Variables of such types cannot be created directly; they are always hidden behind some pointer + size structure. Rust automatically implements the `Sized` trait for every non-DST, and implicitly bounds every generic type parameter by it. To relax the latter, `fn foo<T: ?Sized>...`. The `?Trait` syntax is only available for the `Sized` trait.
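One way to see `Rc`, `RefCell` and `Weak` working together is a small tree with weak parent links. `Node`, `new_node`, `add_child` and `demo` are hypothetical names.

```rust
// Illustrative sketch; all names are made up.
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Children are owned strongly, the parent is referenced weakly, so the
/// two links do not form a reference cycle.
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(value: i32) -> Rc<Node> {
    Rc::new(Node {
        value,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

/// Interior mutability: both nodes are modified through shared references.
fn add_child(parent: &Rc<Node>, child: Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(child);
}

fn demo() {
    let boxed: Box<i32> = Box::new(5);
    assert_eq!(*boxed + 1, 6); // `*` goes through `Deref`

    let root = new_node(1);
    let leaf = new_node(2);
    add_child(&root, Rc::clone(&leaf));

    assert_eq!(Rc::strong_count(&leaf), 2); // `leaf` and root's child list
    assert_eq!(Rc::weak_count(&root), 1);

    // `upgrade` yields `Some` while the parent is alive.
    let parent = leaf.parent.borrow().upgrade();
    assert_eq!(parent.map(|p| p.value), Some(1));

    // Early destruction; the weak link can no longer be upgraded.
    drop(root);
    assert!(leaf.parent.borrow().upgrade().is_none());
}
```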
Operator overloading
- Operator overloading is done by implementing the desired traits in `std::ops`.
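For example, a made-up `Vec2` type could support `+` and unary `-` by implementing `Add` and `Neg`:

```rust
// Illustrative sketch; `Vec2` is a made-up type.
use std::ops::{Add, Neg};

#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: i32,
    y: i32,
}

/// Enables `a + b`.
impl Add for Vec2 {
    type Output = Vec2;

    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

/// Enables unary `-a`.
impl Neg for Vec2 {
    type Output = Vec2;

    fn neg(self) -> Vec2 {
        Vec2 { x: -self.x, y: -self.y }
    }
}
```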
Concurrency
- `thread::spawn` returns a `JoinHandle`, which has a `join` method, similar to C++ `std::thread::join`.
- Message passing for inter-thread communication, like in Go. Channels are created by `std::sync::mpsc::channel()`. The sending endpoint has a `send` method, the receiving one has `recv` and `try_recv`. The receiver can be iterated over too. The channels may have multiple transmitters (it's MPSC), which can be created by `.clone`.
- Messages must implement the `Send` trait. If a type is composed of `Send` types only, it becomes `Send` automatically.
- Types whose variables are safe to be referenced from multiple threads implement the `Sync` trait. If `&T` is `Send`, then `T` is `Sync`. A type made of `Sync` types only is `Sync` automatically.
- `Mutex<T>` is a mutex-guarded variable of `T`. `.lock()` returns a `LockResult` wrapping a `MutexGuard`, which acts as a (potentially mutable) reference to the guarded data. The guard unlocks when it goes out of scope. Mutex implements interior mutability.
- `Arc<T>` is `Rc<T>` with an atomic reference count, which makes it the closer match for C++ `std::shared_ptr`, whose reference count is also updated atomically.
- To actually share mutexes between threads, wrap them: `Arc<Mutex<T>>`.
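A sketch of both styles, message passing and shared state. `sum_via_channel` and `count_with_mutex` are made-up names, and the workloads are arbitrary.

```rust
// Illustrative sketch; function names and workloads are made up.
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Each worker sends `id * 10` over a channel; the results are summed.
fn sum_via_channel(workers: u32) -> u32 {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for id in 1..=workers {
        let tx = tx.clone(); // one transmitter per thread
        handles.push(thread::spawn(move || {
            tx.send(id * 10).unwrap();
        }));
    }
    // Drop the original transmitter so that iteration over the receiver
    // ends once all the clones are gone.
    drop(tx);
    for handle in handles {
        handle.join().unwrap();
    }
    let total: u32 = rx.iter().sum();
    total
}

/// Several threads increment one counter shared through `Arc<Mutex<T>>`.
fn count_with_mutex(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                    // The guard unlocks here, at the end of its scope.
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```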
Assorted standard library functionality
- `std::env::args` is for `int argc, char *argv[]`. It's Unicode-checking (it panics on an argument that is not valid Unicode); if that hurts, then `std::env::args_os`.
- `std::process::exit` is for `exit`.
- `std::env::var` is for `getenv`.
- `println!` prints to `stdout`, `eprintln!` to `stderr`.
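These pieces might be combined as in the sketch below. `env_or`, `user_arg_count`, `fail` and `demo` are hypothetical helpers, not standard library functions.

```rust
// Illustrative sketch; all function names are made up.
use std::env;
use std::process;

/// Value of the environment variable `name`, or `default` if it is unset
/// or not valid Unicode.
fn env_or(name: &str, default: &str) -> String {
    env::var(name).unwrap_or_else(|_| default.to_string())
}

/// Number of command-line arguments, not counting the program name that
/// conventionally comes first.
fn user_arg_count() -> usize {
    // `args()` panics on non-Unicode arguments; `args_os()` does not.
    env::args_os().count().saturating_sub(1)
}

/// Reports an error on stderr and terminates the process; the `!` return
/// type marks that the function never returns.
fn fail(message: &str) -> ! {
    eprintln!("error: {}", message);
    process::exit(1)
}

fn demo() {
    println!("{} argument(s), HOME={}", user_arg_count(), env_or("HOME", "<unset>"));
}
```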
Build and dependency management
- cargo seems to be a much better story than CMake hell or its alternatives.
- A crate is the smallest compilation unit, either a library crate, or a binary crate. Usually means the former. A crate root is the starting source file in it. A package is a bundle of crates with at most one library crate. Standard paths inside a package: `src/main.rs`, `src/lib.rs`, `src/bin`.
- Release profiles correspond to a mixture of CMake build configurations, the `NDEBUG` define, etc. in C++. The `dev` profile corresponds to `Debug`, and `release` to `Release` (or `RelWithDebInfo`?).
- Can customize profiles in `Cargo.toml`, e.g. optimization levels.
- Workspaces organize related packages together in large projects, to share the directory root, `Cargo.toml`, and `Cargo.lock`.
Modules
Not familiar enough with C++ modules to compare.
- Modules (and submodules) inside a crate do namespacing and public/private.
- `src/modulename.rs`, `src/modulename/submodulename.rs`. Modules can be public or private, declared with `pub mod` and `mod` respectively.
- `super::` as a part of a name path goes one level up.
- The `use` keyword imports. Idiomatically, functions are imported through their parent module, everything else directly.
- `use ... as ...` creates a name alias.
- `pub use` re-exports. Used to organize and collect the public API from several potentially nested submodules.
- Nested path syntax: `use foo::{bar, baz, self};`, globs: `use foo::*;`
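A single-file sketch of the same ideas with inline modules; in a real crate, `geometry` and `shapes` would normally live in their own files. All names are made up.

```rust
// Illustrative sketch; module and item names are made up.

mod geometry {
    pub mod shapes {
        pub struct Square(pub f64);

        pub fn area(s: &Square) -> f64 {
            super::square(s.0) // `super::` reaches the parent module
        }
    }

    // Private: visible inside `geometry` and its submodules only.
    fn square(x: f64) -> f64 {
        x * x
    }

    // Re-export, so that callers can write `geometry::Square`.
    pub use self::shapes::Square;
}

use geometry::shapes; // the function is then called as `shapes::area`
use geometry::Square as Sq; // alias for the re-exported type

fn demo() {
    assert_eq!(shapes::area(&Sq(3.0)), 9.0);
}
```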
Tooling
- `rustfmt` formats, as `clang-format` does.
- `Clippy` is the linter.
- `rust-analyzer` for LSP support.
Documentation
- Documentation header comments start with `///` and support Markdown. Built by `cargo doc [--open]`.
- Typical API doc sections: Examples, Panics, Errors, Safety.
- Contained documentation comments start with `//!` and document the item that contains them; typically used for crates and modules.
Testing & benchmarking
- `#[test]` annotates a function to be a test, like the Google Test `TEST` macro in C++. Tests run in parallel by default.
- `assert_eq!` is like gtest `ASSERT_EQ`, except that the args are 'left' and 'right' instead of 'expected' and 'actual' or similar. Equality asserts may be applied to types implementing the `PartialEq` and `Debug` traits.
- `assert!` may take 2nd and subsequent args for a message in the case of failure.
- Tests annotated with `#[should_panic]` test that the annotated function panics, similar but not identical to gtest death tests. Best to add the `expected` parameter to the attribute to specify the reason for the panic.
- Tests may also be implemented by returning a `Result<T, E>`.
- Benchmark tests correspond to Google Benchmark, but are unstable ATM.
- Documentation tests can compile API examples automatically.
- Unit tests go with the code they test, in a `mod tests` annotated with `#[cfg(test)]`.
- Visibility rules happen to allow the testing of private functions.
- Integration tests go to a top-level `tests` directory, no configuration annotation. Each file there is a separate (test) crate; if that's not what's needed, e.g. for setup code, use the `foo/mod.rs` naming convention for non-tests.
- `cargo test` runs in sequence: unit, integration, doc; it does not go to the next category on failure.
- Binary crates cannot have integration tests directly. The usual thing to do is to always have a library crate with a binary crate as minimal as possible.
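A unit test module along these lines might look as follows. `share` and `clamp_to_percent` are made-up functions under test; the `tests` module is only compiled by `cargo test`.

```rust
// Illustrative sketch; `share` and `clamp_to_percent` are made-up functions.

/// Splits `total` into `parts` equal shares.
///
/// # Panics
/// Panics if `parts` is zero.
pub fn share(total: u32, parts: u32) -> u32 {
    if parts == 0 {
        panic!("parts must be non-zero");
    }
    total / parts
}

/// Private helper; still reachable from the test module below.
fn clamp_to_percent(v: i32) -> i32 {
    v.max(0).min(100)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn shares_evenly() {
        // Extra arguments form the failure message.
        assert_eq!(share(10, 2), 5, "10 split into {} parts", 2);
    }

    #[test]
    #[should_panic(expected = "non-zero")]
    fn zero_parts_panics() {
        share(1, 0);
    }

    /// A test may return a `Result`; `Err` means failure.
    #[test]
    fn clamps_private_helper() -> Result<(), String> {
        if clamp_to_percent(150) == 100 {
            Ok(())
        } else {
            Err(String::from("150 should clamp to 100"))
        }
    }
}
```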
Macros
- There are macros, whose names end with an exclamation mark (`println!`).
- Macros can take Rust code and expand to different Rust code. A difference from the C++ preprocessor is that they work on the token/syntax-tree level, not textually. While powerful, how well does this work with tooling? Do they run macros? Can they refactor macros?
- Declarative macros (`macro_rules!`) pattern-match given code to produce code.
- `#[macro_export]` annotation for public macros.
- Procedural macros take token stream input and produce token stream output.
  - One kind is custom `derive` macros that add code to a struct implementation.
  - Attribute-like macros allow creating new attributes.
  - Function-like macros are close to C preprocessor function-like macros, except that they also operate on `TokenStream` and not on arguments directly. Can take a variable number of arguments.
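Two small declarative macros as a sketch of pattern-matching on code. `strings!` and `max_of!` are made up for this example.

```rust
// Illustrative sketch; `strings!` and `max_of!` are made-up macros.

/// Builds a `Vec<String>` from any number of expressions.
macro_rules! strings {
    ( $( $item:expr ),* ) => {
        vec![ $( $item.to_string() ),* ]
    };
}

/// Recursive macro: the largest of one or more expressions.
macro_rules! max_of {
    ($x:expr) => { $x };
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn demo() {
    // Arguments of different types are fine: each expands separately.
    let v: Vec<String> = strings![1, "two", 3.5];
    assert_eq!(v, vec!["1", "two", "3.5"]);
    assert_eq!(max_of!(3, 9, 4), 9);
}
```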
Unsafe Rust & FFI
- `unsafe { ... }`: allows some, well, unsafe features.
- `unsafe` code can dereference raw pointers `*const T`, `*mut T`. Raw pointers are just like C raw pointers.
- `unsafe fn foo() {}`: such an `fn` can only be called from unsafe code.
- `extern "C" { fn putenv ... }` for FFI; such functions may only be called from unsafe code.
- To make a Rust function callable from external code, add the `#[no_mangle]` annotation and `pub extern "C"` before the `fn`.
- Static variables may be declared with `static FOO_BAR: type = value;`. Immutable static vars have an address in memory; constants don't. Accessing any mutable static var is unsafe.
- `unsafe trait Foo`, `unsafe impl Foo for Bar`.
- `union` types exist, mainly used for interfacing with C unions; reading fields is unsafe.
- Raw identifier syntax `r#while` allows using e.g. a keyword as an identifier. Useful for FFI and for interfacing between different Rust editions.
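A sketch of a few of these features that needs no external C code: raw pointers, an `unsafe fn` behind a safe wrapper, a union, a static and a raw identifier. All names are made up.

```rust
// Illustrative sketch; all names are made up.

/// # Safety
/// `ptr` must point to at least `len` initialized, readable `i32` values.
unsafe fn sum_raw(ptr: *const i32, len: usize) -> i32 {
    let mut total = 0;
    for i in 0..len {
        // Dereferencing a raw pointer requires unsafe code.
        total += unsafe { *ptr.add(i) };
    }
    total
}

/// Safe wrapper: a slice guarantees the contract of `sum_raw`.
fn sum(values: &[i32]) -> i32 {
    unsafe { sum_raw(values.as_ptr(), values.len()) }
}

/// A C-compatible union; reading a field is unsafe.
#[repr(C)]
union IntOrBytes {
    int: u32,
    bytes: [u8; 4],
}

/// Reinterprets a `u32` as its bytes in native byte order.
fn bytes_of(value: u32) -> [u8; 4] {
    let u = IntOrBytes { int: value };
    unsafe { u.bytes }
}

/// An immutable static has a fixed address and is safe to read.
static GREETING: &str = "hi";

/// Raw identifier: `match` is a keyword, `r#match` is an ordinary name.
fn r#match(x: i32) -> bool {
    x == 42
}
```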