Hans Pabst
hfp
Intel Extreme Computing
Zurich / Switzerland
https://www.linkedin.com/in/pabst
Application engineer in Intel's Extreme Computing Group enabling scientific applications to take advantage of current and future hardware.

hfp/libxsmm 583

Library for specialized dense and sparse matrix operations, and deep learning primitives.

hfp/xconfigure 41

High-Performance configuration patterns and recipes.

hfp/libxstream 16

Library to program with streams, events, and to queue own functions into a stream.

benoitsteiner/tensorflow-xsmm 14

Improved performance for TensorFlow on Intel hardware.

cdahnken/libxphi 8

libxphi adds easy offload capabilities for blas3 and more

hfp/mpirun 8

MPIRUN wrapper script to generate and execute an MPIRUN command line.

hfp/cp2k 3

Quantum chemistry and solid state physics software package

hfp/tensorflow-xsmm 3

Improved performance for TensorFlow on Intel hardware.

hfp/dbcsr 1

DBCSR: Distributed Block Compressed Sparse Row matrix library

push event hfp/dbcsr

Hans Pabst

commit sha 9776d960ca8802387bd6adc51f230894ac1e4929

Improved 2fb08dcd9c373d008a6494db55b71a9a1446a22d.

push time in 10 hours

push event hfp/cp2k

Frederick Stein

commit sha 7afab966ae005a6a43bd43a183caa6ab8a1b27ea

Fix gradients if group size>1

Fabian Belleflamme

commit sha 32b7dab7e1af6bdd3d21c9cb69d14b18d69f5be4

Adjust regtests

Hans Pabst

commit sha a2d440d3805949284290760c4d4881f512b0044d

Merge branch 'master' of https://github.com/cp2k/cp2k

push time in 13 hours

push event hfp/dbcsr

Hans Pabst

commit sha 8a92ebe837317276418749f44149338a6ecb5aa5

Corrected previous change.

push time in 16 hours

push event hfp/dbcsr

Hans Pabst

commit sha 2fb08dcd9c373d008a6494db55b71a9a1446a22d

Attempt to (quietly) backup every optimization step.

push time in 16 hours

push event hfp/cp2k

Frederick Stein

commit sha 6abef1d260126b8a1b743747facde915ba0a60b4

Cluster local_aL

Frederick Stein

commit sha 23ccf9d2aac6d6d5e3b367a530741fd14dd1f623

Simplify interface to degenerate orbitals

Frederick Stein

commit sha c4defecf01b19543d322eb2531fcb69ff44451e3

Count more DGEMM calls and communication steps in MP2 gradients

Frederick Stein

commit sha ebc120905dd4c304252d9cf828160c25ca89215d

Use ASSOCIATE block

Frederick Stein

commit sha 694dfb8876bfc605f3ce33fb90a57364e3eafad1

Use dgemm_counter for pdgemm-like multiplications, not just for every dgemm call

Frederick Stein

commit sha d086d653c7c65ccb9fe6f9bf8fa5bd092a092daf

Refactor my_B_size
Refactor Eigenval
Refactor homo
Refactor gd_B_virtual
Refactor virtual
Refactor my_B_virtual_*
Refactor Bib_C
Remove B_size_j
Introduce kspin
Refactor kspin
Refactor num_ijk
Refactor max_ijk
Refactor ijk_map

Frederick Stein

commit sha 04553c2946b28061c26b0d286ea55c0fb092d7c4

Refactor degenerate pairs

Frederick Stein

commit sha b76bbc9a139226de35953d6ff0408172bf380464

Remove unnecessary variables

Frederick Stein

commit sha 4a088f45254acec8fa9fe48757580c9ae8dbb15e

Add tag to sendrecv calls

Frederick Stein

commit sha 321be3df7c6008ee49c78db03a4d0b3ef83c5511

Prevent communication if i==k or j==k

Frederick Stein

commit sha 33e2602950f35dc19128ec72115241b5ebae68b6

OMP parallelize copy operations

Frederick Stein

commit sha dd928c5c1efa41fb034307b933b57f05f43985a3

Obtain info about group dist only once

Frederick Stein

commit sha 65f4561a0b1c0773a9c1916aab746f367aa6a7e9

Fix gradients with subgroups larger than 1

Matthias Krack

commit sha a968e966bdea4893622b1a76f86677d6ba97dcd6

Adjust tolerance

Ole Schütt

commit sha 7522f2590a695982c89549185a49bd32e596a974

Docker: Switch Gromacs to use pkgconfig

Hans Pabst

commit sha 4db8babb878d4a0f5d4d53866f89ca5f978c0141

Merge branch 'master' of https://github.com/cp2k/cp2k

push time in 17 hours

push event hfp/dbcsr

Hans Pabst

commit sha bb3437b7a920ac1c3d08b76ce89b295efc2e9c01

Practically removed WG-parameter from all tuning levels.

push time in 21 hours

push event hfp/cp2k

Matthias Krack

commit sha 05cbb8b7eac91d6ee50523a5f30f2620c5b1f9af

Adjust tolerance

Matthias Krack

commit sha dcb34f5d077e4d4338de4d81c49daf8a7b6fced4

Update conventions

Matthias Krack

commit sha 4b1dd747d81f80c4ed1a01d687da9f165216ec37

Restrict default distance check to small systems

Hans Pabst

commit sha a77069e0e05e13447697f38422b2470d2fb616cb

Merge branch 'master' of https://github.com/cp2k/cp2k

push time in a day

push event hfp/dbcsr

Hans Pabst

commit sha 7f6c854b38939b572e89f3f6fbce7b3df5ddd4d1

ocl: introduced tunables and other kernel adjustments (#498)
  • Adjusted default of intra-kernel batchsize (OPENCL_LIBSMM_SMM_BS).
  • Partially (fast/column-path) implemented BK-parameter (kernel).
  • Fixed decision about fallback-atomics (ATOMIC32_ADD64).
  • Thin abstraction (macros) of matrix-access (indexing).
  • OPENCL_LIBSMM_SMM_KERNEL env.var. load kernel source.
  • Introduced (tunable) k-blocks (peeled from m-loop).
  • Introduced (tunable) AT-parameter.

Other:

  • Comdline arg for dir to read/write JSONs (tune_multiply.py).
  • Use os.path instead of string ops (tune_multiply.py).
  • Adjusted tuning-levels (tune_multiply.py).

Hans Pabst

commit sha 2ace55518a115421e369cb7140c39bce3eef8d5c

Merge branch 'develop' of https://github.com/cp2k/dbcsr into develop

push time in 3 days

push event cp2k/dbcsr

Hans Pabst

commit sha 7f6c854b38939b572e89f3f6fbce7b3df5ddd4d1

ocl: introduced tunables and other kernel adjustments (#498)
  • Adjusted default of intra-kernel batchsize (OPENCL_LIBSMM_SMM_BS).
  • Partially (fast/column-path) implemented BK-parameter (kernel).
  • Fixed decision about fallback-atomics (ATOMIC32_ADD64).
  • Thin abstraction (macros) of matrix-access (indexing).
  • OPENCL_LIBSMM_SMM_KERNEL env.var. load kernel source.
  • Introduced (tunable) k-blocks (peeled from m-loop).
  • Introduced (tunable) AT-parameter.

Other:

  • Comdline arg for dir to read/write JSONs (tune_multiply.py).
  • Use os.path instead of string ops (tune_multiply.py).
  • Adjusted tuning-levels (tune_multiply.py).

push time in 3 days

PR merged cp2k/dbcsr

ocl: introduced tunables and other kernel adjustments
  • Adjusted default of intra-kernel batchsize (OPENCL_LIBSMM_SMM_BS).
  • Partially (fast/column-path) implemented BK-parameter (kernel).
  • Fixed decision about fallback-atomics (ATOMIC32_ADD64).
  • Thin abstraction (macros) of matrix-access (indexing).
  • OPENCL_LIBSMM_SMM_KERNEL env.var. load kernel source.
  • Introduced (tunable) k-blocks (peeled from m-loop).
  • Introduced (tunable) AT-parameter.

Other:

  • Comdline arg for dir to read/write JSONs (tune_multiply.py).
  • Use os.path instead of string ops (tune_multiply.py).
  • Adjusted tuning-levels (tune_multiply.py).
+356 -180

0 comment

4 changed files

hfp

pr closed time in 3 days
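
Two of the tune_multiply.py items in the pull request above (a command-line argument for the JSON directory, and os.path in place of string operations) can be illustrated with a minimal sketch. This is not the code from the pull request: the function names, the `--jsondir` option, and the file-name pattern are all hypothetical and chosen only for illustration.

```python
import argparse
import os


def json_path(directory, device, m, n, k):
    """Return the path of a result file inside `directory`.

    The file-name pattern is hypothetical. os.path.join inserts the
    platform's separator, so no manual "/" concatenation is needed.
    """
    filename = "tune_multiply-{}-{}x{}x{}.json".format(device, m, n, k)
    return os.path.join(directory, filename)


def parse_args(argv=None):
    """Parse a hypothetical option naming the directory for JSON files."""
    parser = argparse.ArgumentParser(description="illustrative tuning driver")
    parser.add_argument(
        "--jsondir",
        default=".",
        help="directory to read/write JSON files (hypothetical option name)",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(json_path(args.jsondir, "gpu", 23, 23, 23))
```

Building the path with os.path.join rather than string concatenation avoids doubled or missing separators when the directory argument is given with or without a trailing slash.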

PR opened cp2k/dbcsr

ocl: introduced tunables and other kernel adjustments
  • Adjusted default of intra-kernel batchsize (OPENCL_LIBSMM_SMM_BS).
  • Partially (fast/column-path) implemented BK-parameter (kernel).
  • Fixed decision about fallback-atomics (ATOMIC32_ADD64).
  • Thin abstraction (macros) of matrix-access (indexing).
  • OPENCL_LIBSMM_SMM_KERNEL env.var. load kernel source.
  • Introduced (tunable) k-blocks (peeled from m-loop).
  • Introduced (tunable) AT-parameter.

Other:

  • Comdline arg for dir to read/write JSONs (tune_multiply.py).
  • Use os.path instead of string ops (tune_multiply.py).
  • Adjusted tuning-levels (tune_multiply.py).
+356 -180

0 comment

4 changed files

pr created time in 3 days

push event hfp/dbcsr

Hans Pabst

commit sha fb0fba083542e4002c40fff3b687b910677a14c7

ocl: introduced tunables and other kernel adjustments
  • Adjusted default of intra-kernel batchsize (OPENCL_LIBSMM_SMM_BS).
  • Partially (fast/column-path) implemented BK-parameter (kernel).
  • Fixed decision about fallback-atomics (ATOMIC32_ADD64).
  • Thin abstraction (macros) of matrix-access (indexing).
  • OPENCL_LIBSMM_SMM_KERNEL env.var. load kernel source.
  • Introduced (tunable) k-blocks (peeled from m-loop).
  • Introduced (tunable) AT-parameter.

Other:

  • Comdline arg for dir to read/write JSONs (tune_multiply.py).
  • Use os.path instead of string ops (tune_multiply.py).
  • Adjusted tuning-levels (tune_multiply.py).

push time in 3 days

push event hfp/cp2k

Dr. Mathieu Taillefumier

commit sha 4d3eda9b299f1be8fb6c830e5aabfceeee3a03d0

Fix initialization of string variables
  • string variable are C initialized which make fortran unhappy
  • explicitly set pseudo_potential or full_potential

Signed-off-by: Dr. Mathieu Taillefumier <mathieu.taillefumier@free.fr>

Matthias Krack

commit sha ca5348dbd1881dddde333fb3c3a2380559223240

Add format string as optional argument

Matthias Krack

commit sha eb17c48a57659c8e9133b67007e72d214233dc48

Perform check of interatomic distances

Matthias Krack

commit sha 169432927c55c5a6756daacd82f4c4164b2af3d4

Disable check of interatomic distances completely in tests with ghost atoms at atomic positions

Matthias Krack

commit sha 40adc5980b4ef1a4897c8dcbf76695c4ca14d5c7

Compute interatomic distances only if needed

Hans Pabst

commit sha 674bb0d422f14e9cee6e9724481e75caaccddc1c

Merge branch 'master' of https://github.com/cp2k/cp2k

push time in 3 days

push event hfp/dbcsr

Hans Pabst

commit sha ce685e61a0838e48880f0050835264c4ba682eab

Adjusted tuning-level.

push time in 3 days

push event hfp/dbcsr

Hans Pabst

commit sha 1f0a6db9fbef8addc9b29a48ade5e47a3fdfded9

Harder meta-programming. Code cleanup.

push time in 3 days

push event hfp/dbcsr

Hans Pabst

commit sha e0860d53a5b1a3d0afc3bef1cdd1f668926d3565

Fixed remainder calculation.

push time in 3 days

push event hfp/dbcsr

Hans Pabst

commit sha e1aefa0b98e98e6f8b40ed4ac4b0a465a156e89e

Correction in remainder calculation (still not correct).

push time in 4 days

push event hfp/dbcsr

Hans Pabst

commit sha 9fa54085e8be55f891d19ef7d44fde7c4f0cbf6e

Partially (fast/column-path) implemented BK-parameter (kernel).

push time in 4 days

push event hfp/dbcsr

Hans Pabst

commit sha 2cb3ec64557851e882fdc91423fea3dc13d877e3

Adjusted BK-parameter (glue code).

push time in 4 days

push event hfp/dbcsr

Hans Pabst

commit sha c650ce54618edb793059645163d558209896bfaf

Code cleanup.

push time in 4 days

push event hfp/dbcsr

Hans Pabst

commit sha 01b598b4ec52548fa11c1745c15cfaf7ab281c2d

Use os.path instead of manually building paths.

push time in 4 days

pull request comment hfp/libxsmm

spgemm can handle unlimited unique numbers in matrix A

@alheinecke I suggest merging right away when ready, despite some static analysis issues (I can fix master on Friday). Alternatively, I can pull @ChenyuZhang16's work (if ready), fix, and merge (again on Friday). For the latter, I need to know that this work is ready to merge aside from the static analysis issues.

ChenyuZhang16

comment created time in 4 days

push event hfp/dbcsr

Hans Pabst

commit sha 68199a901ed43f9e6eff9673b559fbc7fa991a59

Added missing parameter (AL).

push time in 5 days

push event hfp/dbcsr

Hans Pabst

commit sha a813bae6d7339f10556e18ec2b6cae1284e05960

Renamed AT-parameter to AL.

push time in 5 days

push event hfp/dbcsr

Hans Pabst

commit sha cc6217633d279412de4654632e3abcecc41625c0

Fixed condition.

push time in 5 days

push event hfp/dbcsr

Hans Pabst

commit sha 711231e616b4e5183ec0d799fcaf91db5f358178

Restrict AT to M=N=K.

push time in 5 days

push event hfp/dbcsr

Hans Pabst

commit sha 7f7a2a232c8604c3a96622a9f0a7e614d7d112ee

Code cleanup.

push time in 5 days

push event hfp/dbcsr

Hans Pabst

commit sha 5d3f4ff1a9c67d86a081181539482b94980613dc

Updated condition for ATOMIC_ADD2_GLOBAL (only valid with transposed access, i.e., linear access/update of global memory).

push time in 5 days

push event hfp/dbcsr

Hans Pabst

commit sha 4a8cd5ddc4e1b9dc1a59cdf75ffe8c028d74822c

Code cleanup.

push time in 5 days

push event hfp/dbcsr

Hans Pabst

commit sha 966b354eee505a602ed001ac334a059be89acb6d

Fixed decision about fallback-atomics (ATOMIC32_ADD64).

push time in 5 days