bluss (bluss) · Low Earth Orbit Debris Field · https://github.com/bluss/rustfaq · I work on Rust, I teach Rust.

bluss/arrayvec 277

A vector with a fixed capacity. (Rust)

bluss/bencher 40

bencher is just a port of the libtest (unstable) benchmark runner to Rust stable releases. `cargo bench` on stable. "Not a better bencher!" = No feature development. Go build a better stable benchmarking library.
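For context, a small usage sketch (the benchmark-function signature mirrors unstable libtest; the macro names are written from memory, so treat them as an assumption rather than authoritative):

#[macro_use]
extern crate bencher;

use bencher::Bencher;

// A benchmark function takes `&mut Bencher`, just like the unstable libtest API.
fn sum_1000(bench: &mut Bencher) {
    bench.iter(|| (0..1000u64).fold(0, |a, b| a + b))
}

benchmark_group!(benches, sum_1000);
benchmark_main!(benches);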

bluss/defmac 12

A macro to define lambda-like macros inline.

bluss/debugit 8

NOTE: the dbg crate mostly supplants this one. DebugIt uses specialization so you can debug-print a value without an explicit (and viral) Debug trait bound.
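A rough illustration of that specialization trick (nightly-only; the names below are made up for the sketch and are not debugit's actual API):

#![feature(specialization)]

use std::fmt;

// Wrapper whose Debug impl works for any type.
struct ShowAnything<T>(T);

trait MaybeDebug {
    fn fmt_maybe(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result;
}

// Fallback for types without a Debug impl: print only the type name.
impl<T> MaybeDebug for T {
    default fn fmt_maybe(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "<value of type {}>", std::any::type_name::<T>())
    }
}

// Specialization picks this impl when the type does implement Debug.
impl<T: fmt::Debug> MaybeDebug for T {
    fn fmt_maybe(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Debug::fmt(self, f)
    }
}

impl<T> fmt::Debug for ShowAnything<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.fmt_maybe(f)
    }
}

fn main() {
    struct NoDebug;
    println!("{:?}", ShowAnything(1_u32)); // uses u32's Debug impl
    println!("{:?}", ShowAnything(NoDebug)); // falls back to the type name
}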

bluss/asprim 2

Rust library with a trait for the `as` operator for primitive numeric types.

bluss/blake2-ppc-altivec 2

BLAKE2s hash function (https://blake2.net), PowerPC/AltiVec implementation by Ulrik Sverdrup

bluss/blis-sys 2

[Feel free to adopt/fork this repo!] Experimental Rust bindings for BLIS

bluss/aeon 1

Aeon is a gradually typed, interpreted programming language that combines the flexibility of a dynamic language with the safety of a static language.

bluss/bmap 1

Experimental B-tree map in Rust using ArrayVec

pull request comment on bluss/indexmap

Update the README for the hashbrown-based implementation

Thanks for all your work on this feature :) and the release

cuviper

comment created time in 18 days

started frankmcsherry/columnation

started time in a month

issue comment on bluss/indexmap

Possible changes for 2.0

I just have an attitude problem with ahash; I would be more charmed by a pure-Rust hash function.

But for an actual argument: for a public dependency, we need them to have a 1.x version or equivalent.

cuviper

comment created time in a month

issue comment on rust-ndarray/ndarray

Cannot use some binary operators "foo_assign" on &mut ArrayBase

You are right, making a new release and updating num-complex should be a priority

Ionizing

comment created time in a month

issue comment on bluss/indexmap

An insert_sorted() method

insert_sorted is still O(n) so I'm not sure it's good enough.

Schmeckinger

comment created time in a month

pull request comment on bluss/indexmap

Switch to hashbrown's RawTable internally

I'd vote to merge this and go for indexmap 1.5 with this

cuviper

comment created time in a month

issue comment on rust-ndarray/ndarray

Help with R language integration

ndarray does not at the moment support creating custom owned arrays like that, unfortunately, but using array views is possible (a minimal sketch below).
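For example, an existing buffer can be borrowed as a two-dimensional view without copying; a minimal sketch (this assumes row-major data, while a column-major R buffer would need ndarray::ShapeBuilder's .f() shape or explicit strides):

use ndarray::ArrayView2;

fn main() {
    // Pretend this Vec is a buffer handed to us by R (or any other foreign allocator).
    let data = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    // Borrow it as a 2x3 view; no data is copied or owned by ndarray.
    let view = ArrayView2::from_shape((2, 3), &data).expect("shape matches length");
    assert_eq!(view[(1, 2)], 6.0);
    println!("{}", view);
}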

andy-thomason

comment created time in a month

issue comment on rust-ndarray/ndarray

Need .axis_iter()-analog taking K axes and returning NdProducer of (D-K)-dim ArrayViews

For it to not be too slow, I think it needs to be implemented with specific dimension types (and not IxDyn/dynamic dim).

tschebotarev

comment created time in a month

issue comment on bluss/indexmap

Possible changes for 2.0

Added an item for a proper std crate feature. It makes things easier for dependents.

cuviper

comment created time in a month

pull request comment on bluss/indexmap

Implement direct usize indexing

That makes sense, but:

"with Index<Idx> for IndexMap<K, V, S, Idx>, but I can't prove that Index<Idx> and Index<&Q> are distinct"

Won't we have a trait bound on Idx? If that's a local trait it should be enough; it will prove that no references implement it? (Or maybe that doesn't work because references are fundamental.)

cuviper

comment created time in a month

pull request comment on bluss/arrayvec

Make new functions const (gated by feature="const").

Nice that this can be done. How far are we from this being stable? The trait bounds are a problem, aren't they?

I'm not sure we can commit to having unstable features in the crate, but if we could, the following remains:

  • I'd suggest the feature name unstable-const-fn (a rough sketch of the gating is below)
  • The crate feature must be documented (as unstable) at the top of lib.rs with the other features.
  • The pull request should be updated with a description

🙂
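A rough sketch of the gating described above (the type and feature wiring are illustrative, not arrayvec's actual code; the feature itself would also be declared in Cargo.toml):

// Constructor compiled as `const fn` only when the hypothetical
// "unstable-const-fn" crate feature is enabled.
pub struct FixedVec {
    len: usize,
}

impl FixedVec {
    #[cfg(feature = "unstable-const-fn")]
    pub const fn new() -> Self {
        FixedVec { len: 0 }
    }

    #[cfg(not(feature = "unstable-const-fn"))]
    pub fn new() -> Self {
        FixedVec { len: 0 }
    }

    pub fn len(&self) -> usize {
        self.len
    }
}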

m-ou-se

comment created time in a month

pull request comment on bluss/indexmap

Switch to hashbrown's RawTable internally

I can't really think of any breaking changes that are on the wishlist.

I definitely don't want to pile on work here, but a 2.0 version could include the parameterization by index type. Then we can update the Default impl to be fully generic without having to fear type inference regressions (and the other new* methods too?).

The "experimental things - Equivalent trait works well, the MutableKeys trait I don't know, so those things seem like they could stay for 2.0.

Which method to make the default .remove() - I still think the current decision is fine, no change from my end for 2.0. (Good way to document this would be to say that "remove is the fastest way to remove an element, and if order should be preserved, ...")

cuviper

comment created time in a month

issue opened in integer32llc/rust-playground

"const fn main" not detected as main function.

Feature request: detect "const fn main" as a main function. 😈

With the following program, the playground says it doesn't detect a main function:

const fn main() -> Result<(), ()> {
    Err(())
}

But if you fool it, it runs:

/*
fn main() */
const fn main() -> Result<(), ()> {
    Err(())
}
   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.44s
     Running `target/debug/playground`
Error: ()

created time in a month

issue comment on bluss/arrayvec

Support for constant generics

@Michael-F-Bryan Yes, I think we should, if it makes sense. There might be other crates already filling this need?

c410-f3r

comment created time in a month

pull request comment on bluss/indexmap

Switch to hashbrown's RawTable internally

I agree

cuviper

comment created time in a month

pull request comment on bluss/indexmap

Switch to hashbrown's RawTable internally

I don't think it's wrong to carefully raise the MSRV; it's the plan we have documented and promised since 1.0. However, we can discuss it with those that depend on us.

cuviper

comment created time in a month

pull request comment on bluss/indexmap

Implement direct usize indexing

I think we should just wait on this, in that case. If we have an indexing operator, IMO it must use the index type that we have parameterized the map with.

cuviper

comment created time in a month

pull request comment on bluss/indexmap

Demonstrate fnv and fxhash as alternate hashers

I'll just approve and you can merge - I don't want to cause conflicts with other PRs, if there are any.

cuviper

comment created time in a month

pull request comment on bluss/indexmap

Demonstrate fnv and fxhash as alternate hashers

Thanks

cuviper

comment created time in a month

push event to cuviper/indexmap

bluss

commit sha 291ebbc730c329188254fe302f0a08639b666454

Make methods that get RawBucket parameters unsafe; add safety comments

These methods trust their caller to pass correct RawBucket values, so we mark them unsafe to use the common safe/unsafe distinction. I used allow(unused_unsafe) to write the functions in the (hopefully) future style of internal unsafe blocks in unsafe functions.

view details

push time in a month

pull request comment on bluss/indexmap

Switch to hashbrown's RawTable internally

This might be the first time we raise the MSRV in 1.x, but with a good reason.

It looks like we might break serde_json, which runs its CI with Rust 1.31? cc @dtolnay, since we'd raise ours to Rust 1.32 here.

The other top rdeps - toml, http, petgraph - look fine from an MSRV standpoint.

cuviper

comment created time in a month

pull request comment on bluss/indexmap

Switch to hashbrown's RawTable internally

Thanks all for your input on the discussion, and helping me find some solid footing again with the raw pointer code. Thanks cuviper for working on this - I'll work on helping you wrap this up as soon as you want.

There are two discomforting things left; I think they're rather general issues that come up in this kind of situation:

  • We handle raw buckets (raw pointer wrappers) that are passed in structs and between function scopes

    • Usually in Rust, we use safe abstractions, maybe just a simple lifetime, to help us with the bookkeeping when leaving local reasoning. Lifetimes often help us convert the problem into using local reasoning instead of global reasoning. I'm not saying it's worth the trouble to wrap this in further abstractions here to make it safer, but it's the kind of thing we try to do when possible.
  • We have internal functions that don't follow the usual safe/unsafe distinction

    • I believe it is best practice that, for example, fn swap_remove_bucket(&mut self, raw_bucket: RawBucket) -> (usize, K, V) should be an unsafe method: we must trust the caller to pass a correct raw bucket. I'll push a commit with this change (a rough sketch of the convention follows below). History is mutable, so feel free to toss it out if you don't like it.

    • We end up in a gray area here, since struct fields cannot be made unsafe to assign (known issue), so we don't follow this completely stringently.
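A minimal sketch of that convention (names and types are hypothetical, not indexmap's real internals): the method is unsafe because it trusts the caller, and the body still uses an explicit unsafe block, silenced with allow(unused_unsafe):

struct Core {
    entries: Vec<(u64, String)>,
}

impl Core {
    /// Safety: `index` must be an entry index obtained from this map's own
    /// hash table, i.e. `index < self.entries.len()`.
    #[allow(unused_unsafe)]
    unsafe fn hash_at(&self, index: usize) -> u64 {
        // Explicit inner unsafe block even though the fn is already unsafe,
        // matching the (hopefully) future style mentioned in the commit message.
        unsafe { self.entries.get_unchecked(index).0 }
    }
}

fn main() {
    let core = Core { entries: vec![(42, "a".to_string())] };
    // The caller upholds the contract: index 0 is valid here.
    assert_eq!(unsafe { core.hash_at(0) }, 42);
}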

cuviper

comment created time in a month

push event to cuviper/indexmap

bluss

commit sha 602c7eed2c7fcd8a1217cf7969d6bb7145a08bad

Make methods that get RawTable parameters unsafe; add safety comments

These methods trust their caller to pass correct RawBucket values, so we mark them unsafe to use the common safe/unsafe distinction. I used allow(unused_unsafe) to write the functions in the (hopefully) future style of internal unsafe blocks in unsafe functions.

view details

push time in a month

Pull request review comment on bluss/indexmap

Switch to hashbrown's RawTable internally

[diff context omitted: IndexMapCore rewritten to replace the Pos/probe index table with hashbrown's RawTable]

For pedagogical consideration - for the rest of the community, I mean - I'll mention that I think there is a strong will to have &mut self/&self be meaningful. We are sometimes thinking of this on a Rust conceptual level: not just loading a pointer through a pointer, but wanting to assign a meaning to accessing through &mut self vs &self.

cuviper

comment created time in a month

Pull request review comment on bluss/indexmap

Switch to hashbrown's RawTable internally

[diff context omitted: IndexMapCore rewritten to replace the Pos/probe index table with hashbrown's RawTable]

I don't think the notion that I'm confused is fair, when this is a whole bundle of unspecified and not yet formalized rules. From my perspective we're hanging on tight and trying to adapt unsafe code to what the future rules will be, based on ongoing discussion.

cuviper

comment created time in a month

pull request comment on bluss/indexmap

Additional operations for manipulating iteration order

It makes sense to me!

  • Option is good for found/not found. Documentation needs to be in tune with the choice of Option or Result: if it's called Success/Failure in the docs, then it must use Result. Maybe it should be reformulated.
  • Indexing errors (out of bounds) normally panic; don't use Result for errors in this crate.
  • The name "reorder" doesn't convey "pick up from where it was and put in a new location" to me. Maybe the methods could have names that use "move" or "put"?

The hashbrown PR will have priority for now; I fear this PR needs to be rebased again after that is merged.

allen-marshall

comment created time in a month

Pull request review comment on bluss/indexmap

Additional operations for manipulating iteration order

[diff context omitted: new OrderMapCore helpers update_indices, probe_to_referenced_pos, and reorder_entry_found for moving an entry to a new position while preserving the order of the others]

into_iter called on something that's already an iterator
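A tiny illustration of the nit (hypothetical value, same shape as the reviewed code): a Range is already an iterator, so the blanket IntoIterator impl makes .into_iter() a no-op that can simply be dropped:

fn main() {
    let indices_to_update = 0..3usize;
    // `.into_iter()` on something that already implements Iterator just returns it.
    let total: usize = indices_to_update.into_iter().sum();
    assert_eq!(total, 3);
}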

allen-marshall

comment created time in a month

Pull request review comment on bluss/indexmap

Switch to hashbrown's RawTable internally

[diff context omitted: IndexMapCore rewritten to replace the Pos/probe index table with hashbrown's RawTable]

Thanks, previously I've been wondering a lot about what cuviper quotes, the "all data reached through a shared reference or data owned by an immutable binding is immutable, unless that data is contained within an UnsafeCell<U>."

But it is clarified here that in an accessor like RawTable::data_end(&self) -> *mut T, the raw pointer is not considered to be based on the &RawTable reference.

In this quote's formulation, it has been unclear what "based on" means, and I think that's just because it hasn't really been formally specified at all; I know it's an ongoing process. <3

cuviper

comment created time in a month

Pull request review comment on bluss/indexmap

Switch to hashbrown's RawTable internally

[diff context omitted: IndexMapCore rewritten to replace the Pos/probe index table with hashbrown's RawTable]

@RalfJung thanks for commenting. You address the initial raw borrow, but in this case the lifetime of the raw pointer just comes from the allocator, I think (just like in Vec, technically)? In this example, there is never a case where we consider the data pointers to be derived from the &RawTable or &mut RawTable references at all.

This is the critical piece I think I'm missing: when do we talk about raw pointers being based on a &/&mut borrow at all?

It seems like in this case we get a raw pointer from the allocator and copy the pointer along (we can use an accessor on &RawTable to copy the raw pointer), and it is considered valid to write through this raw pointer the whole way through?
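To make sure I'm asking about the right pattern, here is a minimal sketch of it with made-up names (Table, data_ptr and touch are not hashbrown's API): a raw pointer copied out through a shared-reference accessor and then written through after a later &mut call on the owner.

```rust
// Minimal sketch, made-up types; not the actual RawTable API.
struct Table {
    data: *mut u8, // pointer that ultimately came from the allocator
    generation: u32,
}

impl Table {
    fn data_ptr(&self) -> *mut u8 {
        // copy the raw pointer out through a shared borrow
        self.data
    }
    fn touch(&mut self) {
        // some unrelated mutation of the table's bookkeeping
        self.generation += 1;
    }
}

fn main() {
    // Back the sketch with a real allocation so the pointer is valid.
    let data = Box::into_raw(Box::new(0u8));
    let mut t = Table { data, generation: 0 };

    let p = t.data_ptr(); // raw pointer copied out via &Table
    t.touch();            // an &mut method call on the owner in between
    unsafe { *p = 1 };    // is writing through `p` still considered valid?

    unsafe { drop(Box::from_raw(t.data)) } // free the allocation again
}
```

In this toy the byte lives in its own allocation, so I'd expect it to be uncontroversial; the real case is the interesting one because the pointer points into storage owned by the table itself.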

cuviper

comment created time in a month

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

(diff context, garbled in extraction: the IndexMapCore rewrite onto hashbrown's RawTable)

It's just that there is a lot happening in the unsafe code guidelines: the Stacked Borrows model, rules around raw pointers and pointer provenance. In some cases, a lot more is forbidden with raw pointers than I thought. Miri doesn't check raw pointers very well, I think.

I'm sorry, I just don't have a model that I can follow here. That the API forces it is unfortunately not an argument; the other arguments are appreciated 😄 With this code I feel unmoored and don't know how to think about it: every argument about validity has to start from the implementation details of RawTable (I can read those), and then what do the &mut self and &self signatures on the methods even mean anymore?

I fear I'm getting incoherent. On a slice it's important that we have as_mut_ptr(&mut self), because we cast &mut [T] as *mut [T] as *mut T. On a Vec<T>, is it important that we modify through as_mut_ptr? Why is there a story that modification through Vec::as_ptr -> *const T is UB? RawTable::data_start(&self) is the same kind of accessor to me.
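Concretely, this is the distinction I mean, in a minimal sketch with nothing indexmap-specific in it:

```rust
fn main() {
    let mut v = vec![1u8, 2, 3];

    // Pointer derived from `&mut Vec<u8>`: writing through it is clearly fine.
    let p_mut: *mut u8 = v.as_mut_ptr();
    unsafe { *p_mut = 10 };

    // Pointer derived from `&Vec<u8>`: reading through it is fine, but casting
    // it to *mut u8 and writing is the pattern whose status I'm unsure about.
    let p_const: *const u8 = v.as_ptr();
    let first = unsafe { *p_const };
    assert_eq!(first, 10);
}
```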

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

 autocfg = "1" serde = { version = "1.0", optional = true, default-features = false } rayon = { version = "1.0", optional = true } +[dependencies.hashbrown]+version = "0.8"+default-features = false+features = ["inline-more", "raw"]

Great, let's drop it. Alex Crichton gave some good rationale for avoiding this feature anyway, so there's no reason not to follow that decision (see the hashbrown PR history).

cuviper

comment created time in 2 months

startedOpenDiablo2/OpenDiablo2

started time in 2 months

pull request commentbluss/indexmap

Add FnvIndexMap, FnvIndexSet

FxHasher and others are more interesting than this hasher. I'd lean towards declining this PR: it's a mere convenience, and it might not be worth the cost of a public dependency, even if that dependency is stable.
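To illustrate why I call it a mere convenience: a downstream crate that already depends on fnv can write the alias itself in one line, something like this sketch (using fnv's FnvBuildHasher; not code from this PR):

```rust
use fnv::FnvBuildHasher;
use indexmap::IndexMap;

// The whole convenience the PR would provide, defined on the user's side.
type FnvIndexMap<K, V> = IndexMap<K, V, FnvBuildHasher>;

fn main() {
    let mut map: FnvIndexMap<&str, u32> = FnvIndexMap::default();
    map.insert("a", 1);
    map.insert("b", 2);
    assert_eq!(map.get("a"), Some(&1));
}
```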

konsumlamm

comment created time in 2 months

issue commentRazrFalcon/pico-args

Forbid flags between options.

Interesting library. On this issue, shouldn't "-h" be the value of "-w"? A random example is "git grep -e -e", where the second -e ends up being the value of the first (and is also the answer to: what do you do if you want to search for a string that starts with a dash?).
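In other words, the convention I would expect, as a generic sketch (hand-rolled, not pico-args' actual API): once an option that takes a value is seen, the next token is consumed as that value no matter what it looks like.

```rust
fn main() {
    // Pretend argv: `tool -w -h`, where -w takes a value.
    let argv = ["-w", "-h"];
    let mut it = argv.iter().copied();

    while let Some(arg) = it.next() {
        if arg == "-w" {
            // The next token is -w's value, even though it starts with a dash.
            let value = it.next().expect("missing value for -w");
            println!("-w = {}", value);
        } else {
            println!("flag or positional: {}", arg);
        }
    }
}
```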

RazrFalcon

comment created time in 2 months

startedRazrFalcon/pico-args

started time in 2 months

startedjonathan-laurent/AlphaZero.jl

started time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

(diff context, garbled in extraction: the IndexMapCore rewrite onto hashbrown's RawTable)

On a closer look, fn shift_remove_bucket(&mut self, raw_bucket: RawBucket) also runs into the same question, of course.

cuviper

comment created time in 2 months

pull request commentbluss/indexmap

Switch to hashbrown's RawTable internally

There isn't much unsafe code, which is nice, but the questions of simultaneous reference and raw pointer validity stump me at the moment, so it's not trivial code.

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

(diff context, garbled in extraction: the IndexMapCore rewrite onto hashbrown's RawTable, including shift_remove_bucket and its index-shifting loop)

raw_bucket here shadows the parameter raw_bucket in a way that obscures the logic, so I think it should use a different name. Same in the else branch.

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

(diff context, garbled in extraction: the IndexMapCore rewrite onto hashbrown's RawTable, ending at the entry method that builds OccupiedEntry with map and raw_bucket fields)

I don't think this is very easy to analyze, but here we have map: &'a mut IndexMapCore<K, V> together with a RawBucket, and the raw bucket is used as a mutable pointer into something that is part of the map. On the surface this looks like a borrowing error, and I don't know, off hand, the rule that would allow it.

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

[review context: large diff of the IndexMapCore internals in src/map/core.rs; the relevant excerpt is the removal path that iterates the raw table]

+        // correct indices that point to the entries that followed the removed entry.
+        // use a heuristic between a full sweep vs. a `find()` for every shifted item.
+        let raw_capacity = self.indices.buckets();
+        let shifted_entries = &self.entries[index..];
+        if shifted_entries.len() > raw_capacity / 2 {
+            // shift all indices greater than `index`
+            unsafe {
+                for raw_bucket in self.indices.iter() {

Ok, so the reason RawTable::iter is unsafe is that it has no lifetime parameter; it's a proxy for a bunch of raw pointers. However, as long as we have the right borrowing behavior, there are no other traps or requirements on calling next, IIUC.

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

 where
     Q: Hash + Equivalent<K>,
 {
-        if let Some((_, found)) = self.find(key) {
-            Some(found)
-        } else {
+        if self.is_empty() {

Why check for empty here?

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

[review context: same large diff of src/map/core.rs; the comment below refers to the new `push` method]

+    /// Append a key-value pair, *without* checking whether it already exists.
+    fn push(&mut self, hash: HashValue, key: K, value: V) -> usize {

Doc could explain the return value

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

 autocfg = "1"
 serde = { version = "1.0", optional = true, default-features = false }
 rayon = { version = "1.0", optional = true }

+[dependencies.hashbrown]
+version = "0.8"
+default-features = false
+features = ["inline-more", "raw"]

rustc doesn't want to use inline-more, so it seems like we shouldn't

cuviper

comment created time in 2 months

pull request commentbluss/indexmap

Switch to hashbrown's RawTable internally

This weekend should have some time where I can review

cuviper

comment created time in 2 months

startedstjepang/fastrand

started time in 2 months

pull request commentrust-ndarray/ndarray

Add WindowsMut to complement Windows

Maybe with ArrayView of Cells, we can get there. It's on the way (and the cast method for raw views is already there, to enable others to experiment)

hanmertens

comment created time in 2 months

startedOthers/shredder

started time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

 where
     /// ***Panics*** if `key` is not present in the map.
     fn index(&self, key: &'a Q) -> &V {
-        if let Some(v) = self.get(key) {
-            v
-        } else {
-            panic!("IndexMap: key not found")
-        }
+        self.get(key).expect("IndexMap: key not found")

Sounds fine actually. Shame on me, I didn't know that expect doesn't modify the message, and now with track_caller we don't change the location info either with this change. Using expect seems good.

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

 where
     ///
     /// Computes in **O(n log n + c)** time and **O(n)** space where *n* is
     /// the length of the map and *c* the capacity. The sort is stable.
-    pub fn sort_by<F>(&mut self, compare: F)
+    pub fn sort_by<F>(&mut self, mut cmp: F)
     where
         F: FnMut(&K, &V, &K, &V) -> Ordering,
     {
-        self.core.sort_by(compare)
+        self.with_entries(|entries| {
+            entries.sort_by(|a, b| cmp(&a.key, &a.value, &b.key, &b.value));
+        });

This just appeared in the diff, sweep is not necessary

cuviper

comment created time in 2 months

pull request commentbluss/indexmap

Use a consistently seeded Rng for benchmark stability

Thanks, good catch.

cuviper

comment created time in 2 months

pull request commentbluss/indexmap

Switch to hashbrown's RawTable internally

Really cool work.

So we lose the 32-bit index optimization and still have some performance improvements; does that mean there is more to gain?

We are giving up being implemented in safe Rust and can only do so if we show performance improvements that are above the noise. Some of the lookup benchmark cases do that, and the insert cases barely do it. The improved lookup benches are really encouraging.

I'll read RawTable and then come back to reviewing. Are you going to start experimenting with this version of indexmap in rustc, before it gets merged?

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

+#![allow(unsafe_code)]

We need to add a comment to the top-level deny we have for this - to say that we have some sections where it is allowed in the core of the crate.

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

 where
     /// ***Panics*** if `key` is not present in the map.
     fn index(&self, key: &'a Q) -> &V {
-        if let Some(v) = self.get(key) {
-            v
-        } else {
-            panic!("IndexMap: key not found")
-        }
+        self.get(key).expect("IndexMap: key not found")

Style-wise I'm not sure I agree that Option::expect is a good way for a library to report a panic; I prefer the panic as it was.

cuviper

comment created time in 2 months

Pull request review commentbluss/indexmap

Switch to hashbrown's RawTable internally

 where
     ///
     /// Computes in **O(n log n + c)** time and **O(n)** space where *n* is
     /// the length of the map and *c* the capacity. The sort is stable.
-    pub fn sort_by<F>(&mut self, compare: F)
+    pub fn sort_by<F>(&mut self, mut cmp: F)
     where
         F: FnMut(&K, &V, &K, &V) -> Ordering,
     {
-        self.core.sort_by(compare)
+        self.with_entries(|entries| {
+            entries.sort_by(|a, b| cmp(&a.key, &a.value, &b.key, &b.value));
+        });

Style-wise I'd put move on both these closures so that they capture their environment by value; it's just something we've seen optimize better in some cases.

cuviper

comment created time in 2 months

pull request commentbluss/indexmap

Adding `reverse()` to `IndexMap` & `IndexSet`

Special thanks for using quickcheck

linclelinkpart5

comment created time in 2 months

Pull request review commentbluss/indexmap

Unify the entry and insert code

 impl<K, V> OrderMapCore<K, V> {
     }
 }

+trait ProbeAction<'a, Sz: Size, K, V>: Sized {
+    type Output;
+    // handle an occupied spot in the map
+    fn hit(self, entry: OccupiedEntry<'a, K, V>) -> Self::Output;
+    // handle an empty spot in the map
+    fn empty(self, entry: VacantEntry<'a, K, V>) -> Self::Output;
+    // robin hood: handle a spot that you should steal because it's better for you
+    fn steal(self, entry: VacantEntry<'a, K, V>) -> Self::Output;
+}
+
+struct InsertValue<V>(V);
+
+impl<'a, Sz: Size, K, V> ProbeAction<'a, Sz, K, V> for InsertValue<V> {
+    type Output = (usize, Option<V>);
+
+    fn hit(self, entry: OccupiedEntry<'a, K, V>) -> Self::Output {
+        let old = replace(&mut entry.map.entries[entry.index].value, self.0);
+        (entry.index, Some(old))
+    }
+
+    fn empty(self, entry: VacantEntry<'a, K, V>) -> Self::Output {
+        let pos = &mut entry.map.indices[entry.probe];
+        let index = entry.map.entries.len();
+        *pos = Pos::with_hash::<Sz>(index, entry.hash);
+        entry.map.entries.push(Bucket {
+            hash: entry.hash,
+            key: entry.key,
+            value: self.0,
+        });
+        (index, None)

This method body seems "loose" like this - could it not just call a method on the entry? In general, it's hard to follow the insert logic when it's spread out.

(This comment is rather perfectionist, and not something I wanted to make a requirement to fix before merging).

mwillsey

comment created time in 2 months

Pull request review commentbluss/indexmap

Unify the entry and insert code

[review context: same ProbeAction diff as above; the comment below refers to the MakeEntry impl]

+struct MakeEntry;
+
+impl<'a, Sz: Size, K: 'a, V: 'a> ProbeAction<'a, Sz, K, V> for MakeEntry {
+    type Output = Entry<'a, K, V>;
+    fn hit(self, entry: OccupiedEntry<'a, K, V>) -> Self::Output {
+        Entry::Occupied(entry)
+    }
+    fn empty(self, entry: VacantEntry<'a, K, V>) -> Self::Output {

nitpick missing empty line between the methods

mwillsey

comment created time in 2 months

pull request commentbluss/indexmap

Unify the entry and insert code

I'll add it as comments. I thought I might as well merge and edit myself; sometimes I prefer that to the back-and-forth, and I just don't want reviews to be too perfectionist.

mwillsey

comment created time in 2 months

pull request commentbluss/indexmap

Adding `reverse()` to `IndexMap` & `IndexSet`

Please update the PR description to close the feature request with a keyword https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword thanks

linclelinkpart5

comment created time in 2 months

issue commentrust-ndarray/ndarray

API for panic-safe slicing

There isn't an API for that

nagisa

comment created time in 2 months

pull request commentbluss/indexmap

Unify the entry and insert code

Thanks. I'll merge this when I have time to make some post-merge edits

mwillsey

comment created time in 2 months

Pull request review commentrust-lang/rust

Free `default()` forwarding to `Default::default()`

+# `default_free_fn`
+
+The tracking issue for this feature is: [#73014]
+
+[#73014]: https://github.com/rust-lang/rust/issues/73014
+
+------------------------
+
+Adds a free `default()` function to the `std::default` module.  This function
+just forwards to [`Default::default()`], but may remove repetition of the word
+"default from the call site.

There's an unmatched " here

ilya-bobyr

comment created time in 2 months

pull request commentbluss/indexmap

Refactor the map core to its own module

Before we get too far on hashbrown, are you OK with the changes in this PR alone?

Yes, sure. Nothing seems remotely controversial about it, unless I'm missing something (?), so it's just a very nice cleanup.

I'm somewhat inclined to give this a lot of weight, myself, even if some performance is lost.

Of course, as long as we have some tangible gains, like performance, for indexmap users, too.

cuviper

comment created time in 2 months

pull request commentbluss/indexmap

Unify the entry and insert code

Could you update the PR description to use one of the formulations that means that #108 is closed when this is merged? :) https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword

mwillsey

comment created time in 2 months

Pull request review commentbluss/indexmap

Unify the entry and insert code

[review context: earlier version of the ProbeAction diff; the comment below refers to the InsertValue::steal method]

+    fn steal(self, entry: VacantEntry<'a, K, V>) -> Self::Output {
+        let index = entry.map.entries.len();

Are we then maybe identifying the main thing that slows down entry compared with the old insert? Maybe if we updated entry's VacantEntry to skip that loop for the non-steal case, insert and entry would match each other in performance?

mwillsey

comment created time in 2 months

pull request commentbluss/indexmap

Refactor the map core to its own module

I guess that we unfortunately need clarification on the "experimental"-ness of the RawTable, if we depend on it for indexmap 1.x, even though it doesn't affect our API surface.

cuviper

comment created time in 2 months

pull request commentbluss/indexmap

Refactor the map core to its own module

Really cool that it can be done, nice job!! If the benchmark gains are significant, it seems like we should use hashbrown, no doubt. It also resolves our old question: should we use unsafe to speed up indexmap? This way we outsource it to a well-known and well-tested implementation.

cuviper

comment created time in 2 months

CommitCommentEvent

Pull request review commentrust-lang/rust

Free `default()` forwarding to `Default::default()`

 pub trait Default: Sized {
     fn default() -> Self;
 }

+/// Equivalent to [`Default::default()`] but without repeating the word
+/// "default" twice at the call site.

A proposal would be to word this in a positive way, i.e describe what this function does well, instead of describing what its alternative does less well, maybe something like this:

Return the default value of a type according to the Default trait.

The type to return is inferred from context; this is a shorter but equivalent way of calling Default::default().

ilya-bobyr

comment created time in 2 months

Pull request review commentrust-lang/rust

Free `default()` forwarding to `Default::default()`

[review context: the new free `default()` function; the doc example is elided here]

+#[unstable(feature = "default_free_fn", issue = "73014")]
+#[inline]
+pub fn default<T: Default>() -> T {
+    Default::default()

I guess I'm here because avoiding Default::default is a pet peeve of mine, which in some way I must partly share with ilya-bobyr then. Since we say equivalent, any equivalent code would be fine, and I think we should prefer the style that we would like to see in Rust codebases around the ecosystem, hence T::default().

If the docs say Default::default, maybe that could change now or in the future.

ilya-bobyr

comment created time in 2 months

Pull request review commentbluss/indexmap

Unify the entry and insert code

 impl<K, V> OrderMapCore<K, V> {
     }
 }

+trait ProbeAction<'a, Sz: Size, K, V>: Sized {
+    type Output;
+    fn hit(self, entry: OccupiedEntry<'a, K, V>) -> Self::Output;
+    fn empty(self, entry: VacantEntry<'a, K, V>) -> Self::Output;
+    fn steal(self, entry: VacantEntry<'a, K, V>) -> Self::Output {
+        self.empty(entry)
+    }

For code readability we can prefer to not have this default impl for steal.

mwillsey

comment created time in 2 months

Pull request review commentbluss/indexmap

Unify the entry and insert code

 where
     K: Hash + Eq,
     S: BuildHasher,
 {
-    // FIXME: reduce duplication (compare with insert)
-    fn entry_phase_1<Sz>(&mut self, key: K) -> Entry<K, V>
+    fn probe_action<'a, Sz, A>(&'a mut self, key: K, action: A) -> A::Output

Nice refactoring. I think this method needs a clearer name. We use "probe" in every loop when reading/writing to the map, so it is not specific enough. Even "insert phase 1" is an ok name - and insert_phase_2 still exists, so it needs to be renamed in tandem, if we entirely remove anything named "phase 1".

mwillsey

comment created time in 2 months

Pull request review commentbluss/indexmap

Unify the entry and insert code

[review context: same ProbeAction diff; the comment below refers to the InsertValue impl methods]

+impl<'a, Sz: Size, K, V> ProbeAction<'a, Sz, K, V> for InsertValue<V> {
+    type Output = (usize, Option<V>);
+    fn hit(self, entry: OccupiedEntry<'a, K, V>) -> Self::Output {
+        let old = replace(&mut entry.map.entries[entry.index].value, self.0);
+        (entry.index, Some(old))
+    }
+    fn empty(self, entry: VacantEntry<'a, K, V>) -> Self::Output {

A style nitpick is that all the methods need an empty line separating them from the item before them

mwillsey

comment created time in 2 months

Pull request review commentbluss/indexmap

Unify the entry and insert code

 impl<K, V> OrderMapCore<K, V> {
     }
 }

+trait ProbeAction<'a, Sz: Size, K, V>: Sized {
+    type Output;
+    fn hit(self, entry: OccupiedEntry<'a, K, V>) -> Self::Output;

If possible, document each method with a one-line summary of what it represents

mwillsey

comment created time in 2 months

pull request commentbluss/indexmap

Unify the entry and insert code

That's awesome

mwillsey

comment created time in 2 months

Pull request review commentbluss/indexmap

Unify the entry and insert code

[review context: same ProbeAction diff; the comment below refers to the InsertValue::steal method]

+    fn steal(self, entry: VacantEntry<'a, K, V>) -> Self::Output {
+        let index = entry.map.entries.len();

this could be just entry.insert_impl::<Sz>(self.0) right? Something like that to reduce duplication

mwillsey

comment created time in 2 months

Pull request review commentrust-lang/rust

Free `default()` forwarding to `Default::default()`

[review context: the new free `default()` function; the doc example is elided here]

+#[unstable(feature = "default_free_fn", issue = "73014")]
+#[inline]
+pub fn default<T: Default>() -> T {
+    Default::default()

This is a nitpick I'd mention in any PR, but I guess it's extra strange now - here, why don't we prefer using the style T::default()?

ilya-bobyr

comment created time in 2 months

pull request commentrust-lang/rust

Free `default()` forwarding to `Default::default()`

Since there is no RFC I will put in a couple of questions here.

We have the following existing alternatives:

  1. T::default() if the type is known or partially known. Note that we can use path-like syntax PathLike::default() and type syntax <Type>::default().
  2. <_>::default() as a special case of the type syntax when the type is to be inferred. This works the same way as the free function in this PR.

The existing syntaxes have the following benefits. Using an explicit type makes the code easy to read; when the explicit type is the "source" of the type, this becomes the best option. Even when the type would be redundant, it can, as a matter of taste, still be good code style: instead of let x: T = Default::default(), prefer let x = T::default().

The <Type>::default() syntax allows partial inference, for example like <HashMap<_, _>>::default() (pick a hashmap with the default for the hasher parameter).

The <_>::default() syntax has the benefit that it allows full type inference, and that it works as long as the Default trait is in scope. So it has power equal to the free function being added here.

The drawback would be if you don't think this is good-looking or easy to understand code.
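
For concreteness, here is a minimal sketch (not from the PR; Config and retries are made-up names) showing the existing alternatives side by side:

    use std::collections::HashMap;

    #[derive(Default)]
    struct Config {
        retries: u32,
    }

    fn main() {
        // 1. Explicit type as the receiver
        let a = Config::default();
        // Partial inference: the hasher parameter is filled in by the type default
        let b = <HashMap<String, u32>>::default();
        let c: HashMap<String, u32> = <HashMap<_, _>>::default();
        // 2. Full inference, the same power as the proposed free `default()`
        let d: Config = <_>::default();
        let _ = (a.retries, b.len(), c.len(), d.retries);
    }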

ilya-bobyr

comment created time in 2 months

issue commentrust-lang/rust

Tracking issue for std::default::default()

You call out Default::default() in the description here - but why aren't T::default() or <$any_type>::default() good alternatives? They work today, if the Default trait is in scope, and it is in the prelude.

ilya-bobyr

comment created time in 2 months

issue commentrust-lang/rust

Whats wrong with turkey ?

Does this feature already exist in a crate in the ecosystem, and what shape of the API have they settled on? I can't see it in unic, but it might exist.

OzqurYalcin

comment created time in 2 months

issue commentbluss/indexmap

`indexmap` is rebuilt on every `cargo` run

Thanks for providing so much good information in the bug report.

Dfinity-Alin

comment created time in 2 months

issue commentrust-lang/rust

NonNull methods for pointer offset, add, sub

I mean non-null-ness, yes.

Non-null-ness is not a concern as long as we stick to the rules we have, which are partly based on requirements from the LLVM backend. The whole paragraph tries to look beyond just LLVM.

bluss

comment created time in 2 months

created tagbluss/indexmap

tag1.4.0

A hash table with consistent order and fast iteration; access items by key or sequence index

created time in 2 months

push eventbluss/indexmap

bluss

commit sha da1835123cc489eeaa7a83da961c0031e17adc79

1.4.0

view details

push time in 2 months

pull request commentbluss/indexmap

Use a plain relative path for autocfg::rerun_path

I'll make a point release

cuviper

comment created time in 2 months

push eventbluss/indexmap

Josh Stone

commit sha b6c596319bd249a7337879334bf57ce2580cdfa0

Use a plain relative path for autocfg::rerun_path

view details

bluss

commit sha 6e929fa72822e3abe5421fcf5bb1180cac271109

Merge pull request #124 from cuviper/rerun_path

Use a plain relative path for autocfg::rerun_path

view details

push time in 2 months

issue closedbluss/indexmap

`indexmap` is rebuilt on every `cargo` run

From what I can tell, the changes in #106 caused one of our CI jobs' run times to shoot up from ~23 minutes (while depending on indexmap 1.1.0) to 1 1/2 - 2 hours (about 4x, while depending on indexmap 1.3.2).

Due to the shortcomings of cargo we need to run it 4 times in succession to cover all bases:

$ cargo build --release --all-targets      # Build production code
$ cargo test --no-run --release            # Build tests and doc tests
$ cargo test --no-run --release --benches  # Build benchmarks
$ cargo test --release                     # Run tests and doc tests

Normally this is not an issue, as the cargo build at the top usually builds everything, the next two are no-ops, and the last one again doesn't need to build anything, only run the tests. But after bumping indexmap to 1.3.2 in Cargo.lock, suddenly each of those cargo commands first starts rebuilding indexmap and then everything that depends on it, essentially doing the exact same thing 4 times.

AFAICT that is because of the use of autocfg in build.rs. Every time cargo is run, it thinks indexmap has been modified and rebuilds it. Which in turn causes everything that depends on it to be rebuilt. Although nothing at all has changed.

I don't have a solution and I've already spent way too much time tracking this down and finding a workaround (rolled back indexmap to 1.2.0), but it would be great if you could find a way to convince cargo that nothing has changed and it doesn't need to rebuild indexmap every single time.

(BTW, this is not only an issue for CI, any local build or test run will also start by rebuilding indexmap and go from there.)
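
For reference, a minimal sketch of the kind of build script involved (illustrative only, not the exact indexmap build.rs; `has_std` is the kind of cfg it emits):

    // build.rs (illustrative sketch)
    fn main() {
        let ac = autocfg::new();
        // Probe the sysroot and emit a cfg like `has_std` when libstd is available.
        ac.emit_sysroot_crate("std");
        // Tell Cargo to rerun this script only when build.rs itself changes.
        // If the emitted path cannot be resolved (for example because it is
        // relative to the wrong directory), Cargo rebuilds the crate on every run.
        autocfg::rerun_path("build.rs");
    }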

closed time in 2 months

Dfinity-Alin

PR merged bluss/indexmap

Use a plain relative path for autocfg::rerun_path

Fixes #123.

+1 -1

1 comment

1 changed file

cuviper

pr closed time in 2 months

pull request commentbluss/indexmap

Use a plain relative path for autocfg::rerun_path

Thanks!

cuviper

comment created time in 2 months

Pull request review commentpetgraph/fixedbitset

Implement custom behavior for PartialEq, Eq, PartialOrd, Ord and Hash

 impl <'a> BitXorAssign for FixedBitSet
     }
 }

+/// Two `FixedBitSet`s are equal if and only if they have the exact same set of "one" elements
+impl PartialEq for FixedBitSet
+{
+    #[inline]
+    fn eq(&self, other: &Self) -> bool {
+        self.ones().eq(other.ones())

While conceptually this algorithm for comparison might be good, I'm afraid that it is slow. Comparison should be based on comparing the blocks, I would guess.
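
A rough sketch of the block-based comparison being suggested (assumptions: `as_slice` exposes the underlying u32 blocks, and bits past the set's length are kept zeroed):

    use fixedbitset::FixedBitSet;

    // Compare whole blocks instead of walking `ones()` bit by bit.
    fn bitsets_eq(a: &FixedBitSet, b: &FixedBitSet) -> bool {
        let (xs, ys) = (a.as_slice(), b.as_slice());
        let common = xs.len().min(ys.len());
        xs[..common] == ys[..common]
            && xs[common..].iter().all(|&block| block == 0)
            && ys[common..].iter().all(|&block| block == 0)
    }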

sitegui

comment created time in 2 months

startedgerben-s/quicksort-blog-post

started time in 2 months

issue commentpetgraph/fixedbitset

Owned iterator: into_ones()

I think it makes sense. My review time is a bit limited, but hopefully it can be solved

sitegui

comment created time in 2 months

issue commentbluss/either

set serde derive to use serde untagged

Ok, it will certainly not be a feature for the reasons clarified in #22.
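
For reference, this is the kind of representation the request was about; a downstream crate can define its own untagged mirror type (the name UntaggedEither below is made up) instead of changing the derive on either::Either itself:

    use serde::{Deserialize, Serialize};

    // A local stand-in type, if untagged (de)serialization is wanted.
    #[derive(Serialize, Deserialize)]
    #[serde(untagged)]
    enum UntaggedEither<L, R> {
        Left(L),
        Right(R),
    }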

meltinglava

comment created time in 2 months

delete branch rust-ndarray/ndarray

delete branch : par-collect

delete time in 2 months

delete branch rust-ndarray/ndarray

delete branch : parallel-array-view-1d

delete time in 2 months

PR closed rust-ndarray/ndarray

Implement IndexedParallelIterator for one-dimensional arrays

The rayon IndexedParallelIterator trait maps well to any one-dimensional indexed sequence, and we can implement this trait as a special case (of the general unindexed parallel iterator of an arbitrary array).

Implementing this is logical and it fits, but we will still recommend ndarray's Zip for lock-step iteration. The reason is performance: Zip will do it better.
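
For comparison, the recommended lock-step style looks roughly like this (a sketch, assuming the rayon feature of ndarray as of this release, where the parallel method on Zip is par_apply):

    use ndarray::{Array2, Zip};

    fn add_into(sums: &mut Array2<f64>, a: &Array2<f64>, b: &Array2<f64>) {
        // Lock-step traversal over all three arrays, parallelized by splitting
        // the Zip rather than driving a rayon IndexedParallelIterator.
        Zip::from(sums)
            .and(a)
            .and(b)
            .par_apply(|s, &x, &y| *s = x + y);
    }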

+102 -7

1 comment

4 changed files

bluss

pr closed time in 2 months

pull request commentrust-ndarray/ndarray

Implement IndexedParallelIterator for one-dimensional arrays

These implementations logically fit, but the performance of the enabled methods is not good - so not having these is a way to push users towards using ndarray::Zip for zipping and unordered parallel iterator methods anyway, which perform many times better.

bluss

comment created time in 2 months

push eventrust-ndarray/ndarray

bluss

commit sha ad856bd06f995f097ef8fe473c09a296c79616ab

FEAT: Special case D::from_dimension for Ix1

view details

bluss

commit sha 511f8b2cc97bb12ce75b8d94945cef5ed80dfad6

FIX: Use Zip::fold_while for final reduction in parallel array view

When fold_with is used, use Zip::fold_while to fold the array view's parallel iterator. Note that in some cases, the IntoIterator of the view is used instead.

view details

bluss

commit sha 192a166e479cb68bb480a7126fd1233722f73c5c

FIX: Make is_contiguous pub(crate)

The reason this method is not yet public is that it's not accurate (false negatives) for less common layouts. It's correct for C/F, i.e. row/col major layouts.

view details

bluss

commit sha 35e89f878adcaa4899b567d8deccbe6e426c11a2

TEST: Add benchmarks for parallel collect

view details

bluss

commit sha 84295c4e893f2ee7e1cd4d95f2ef46bfde9ce0f8

FEAT: Factor out traits SplitAt and SplitPreference

To be used by Zip and parallel Zip

view details

bluss

commit sha 3ab67eb062d5317edfe35ced0d5e24022c6fe2fa

FEAT: Add ParallelIterator ParallelSplits

This iterator is for internal use; it produces the splits of a Zip (it splits the Zip the same way as the regular parallel iterator for Zip, but here the whole Zip is the produced item of the iterator). This is helpful as a building block for other operations.

view details

bluss

commit sha ee97dbfbaf42b3a21d92384f414ec20fce0617dd

TEST: Add parallel collect test for small arrays

view details

bluss

commit sha fe2ebf6eb96a191231751ebae65e76ab94476e76

FEAT: Add internal Zip::last_producer

This method is useful for parallel Zip.

view details

bluss

commit sha 42772962313e550904b027fa233ceda45f1a3214

FEAT: Implement generic parallel collect

Allow non-copy elements by implementing dropping partial results from collect (needed if there is a panic with unwinding during the apply-collect process).

It is implemented by:

  1. allocate an uninit output array of the right size and layout
  2. use ParallelSplits to split the Zip into chunks processed in parallel
  3. for each chunk keep track of the slice of written elements
  4. each output chunk is contiguous due to the layout being picked to match the Zip's preferred layout
  5. Use reduce to merge adjacent partial results; this ensures we drop all the rests correctly, if there is a panic in any thread

view details

bluss

commit sha 6d43c133db81f7c32bfd2b20a9ea9d92513d07cf

FIX: In ParallelSplits, count maximum number of splits

Instead of requiring to use the size in elements of the thing-to-split, simply use a counter for the number of splits.

view details

bluss

commit sha e3ebf8c545ba7e5d59ef0da04ae8c909a90b1166

FIX: In Zip, fix unused code warning for `last_of` macro

view details

bluss

commit sha efcd6074324c18ef21cd2a0ce10c89b2732aef13

FIX: Make Partial::new an unsafe method

view details

bluss

commit sha 8ed9ac331ebf11f85d18ed963b297ff033bf35c8

FIX: Wrap long lines in impl_par_methods.rs

view details

bluss

commit sha d02b757aca316f96902ebd63ed4d0a71a51acfb3

FIX: Use Partial instead of PartialArray Partial is just a contiguous slice, and much simpler than PartialArray; Partial is all that's needed, because the area written will always be contiguous.

view details

bluss

commit sha e47261294d79d40b21c4690a457e5b2c7dc4049b

FEAT: Combine common parts of apply_collect and par_apply_collect

Factor out the common part of the parallel and regular apply_collect implementation; the non-parallel part came first and ended up more complicated originally. With the parallel version in place, both can use the same main operation, which is implemented in the method Zip::collect_with_partial.

view details

bluss

commit sha f69248e977f42b7afb01f5c280baba1da09738c9

Merge pull request #817 from rust-ndarray/par-collect

Implement parallel collect to array for non-Copy elements

view details

push time in 2 months

PR merged rust-ndarray/ndarray

Implement parallel collect to array for non-Copy elements

Allow non-copy elements by implementing dropping partial results from collect (needed if there is a panic with unwinding during the apply-collect process).

It is implemented by:

  1. allocate an uninit output array of the right size and layout
  2. use parallelsplits to split the Zip into chunks processed in parallel
  3. for each chunk keep track of the slice of written elements
  4. each output chunk is contiguous due to the layout being picked to match the Zip's preferred layout
  5. Use reduce to merge adjacent partial results; this ensures we drop all the rests correctly, if there is a panic in any thread
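
As a user-facing sketch, the operation this enables looks roughly like the following (assuming the method is exposed as par_apply_collect on Zip, matching the commit names above):

    use ndarray::{Array2, Zip};

    fn pairs(a: &Array2<f64>, b: &Array2<f64>) -> Array2<Vec<f64>> {
        // Collect non-Copy elements (here Vec<f64>) from a parallel Zip;
        // partially written results are dropped correctly if a closure panics.
        Zip::from(a).and(b).par_apply_collect(|&x, &y| vec![x, y])
    }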
+478 -189

2 comments

14 changed files

bluss

pr closed time in 2 months

more