pfultz2/Cloak 677

A mini-preprocessor library to demonstrate the recursive capabilities of the preprocessor

boostorg/hof 458

Higher-order functions for C++

pfultz2/cget 354

C++ package retrieval

pfultz2/ClangComplete 116

Clang completion for sublime

pfultz2/cmake-get 58

Get dependencies with cmake

pfultz2/args 49

Simple and type-safe commandline argument parser for C++14

pfultz2/awesome-cpp-1 40

A curated list of awesome C++ frameworks, libraries and software.

pfultz2/cget-recipes 14

Recipes for cget

crtrott/kokkos 4

Core repository for Kokkos software

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 23f6284443ff94f400b1d56a369042faeeee09f9

Add python 2 dev packages

push time in 3 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul Fultz II

commit sha e2cbb01eb182c4bbf67fff88078b6aaac8e2d091

Add support for constructing value from enum (#632) * Add support for constructing value from enum * Formatting

Paul Fultz II

commit sha 9f283810a32b28402e3026242292dca64288fa27

Some perf improvements to bert (#627) * Fuse gemm in fuse ops * Formatting * Add const ref * Remove assert * Skip already fused gemms * Skip already fused gemm * Formatting * Use float_equal * Avoid non-standard shapes for inputs * Formatting Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Shucai Xiao

commit sha 4fdc4dfe6f95249ec82ff7f9b4a215bc0a2cb0c3

Where op (#630) * add the where operator * clang format * add where unit tests * add where op unit test * clang format * add more unit tests for the where op * clang format * Add support for constructing value from enum * Formatting * add an comment about the algorithm * call make_op to create the convert instruction Co-authored-by: Paul <pfultz2@yahoo.com>

Paul Fultz II

commit sha 24933bd801ddb50e46c035d7b92b8420976cbe03

Add pointwise attribute to operators (#634) * Add pointwise attribute * Formatting * Fix compilation * Remove unused variable * Formatting Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Shucai Xiao

commit sha 8bf97a2ffb5620863495e0ec5db9ae29494f4eb4

Dockerfile download onnx models to enable real model unit tests (#628) * turn on the alexnet unit tests through downloading the real model in dockerfile * clang format * refine home directory * enable all real model unit tests * clang format * fix review comments and disable incorrect test cases * clang format * increase timeout value to avoid time out * turn on one more test * disable one more real model unit test * fix review comments * remove unnecessary comment lines * clang format * fix review comments * print out program info when there is an error * clang format * redirect c++ stdout to sys.stdout of python for python api * clang format * two minor issues * fix a cppcheck error * fix review comments * clang format * remove unnecessary changes * refine script

Shucai Xiao

commit sha 16a03b39ea2760dbdef593529a5d0d3e62ffdfb6

Concat transpose bug (#638) * fix a bug related to concat transpose. * clang format * use return instruction to replace the fake instruction Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha b83cd632f8f17e5d0a02d07a0f1a7e9f1f9b0d12

Improve API for program parameters (#635) * Take numpy array directly in python API * Formatting * Intialize program parameters from initializer list * Formatting Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha 3bbaf664e513d1e2496af01dcc8daef43d0f44cd

Bump version

Shucai Xiao

commit sha 549402479940b2a089281bea836cae75b7d14a3e

Update ort commit to latest ort changes (#643) Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

Shucai Xiao

commit sha 48fa934d180cda7b6764b21465e203e39ca1cab3

Selu operator (#642) * code backup * clang format * support for sele operator * clang format * added an onnx unit test for selu * clang format * add more unit tests for the selu operation Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

push time in 4 hours

pull request comment ROCmSoftwarePlatform/AMDMIGraphX

Added less and greater ops

Can you push this to a branch instead of a fork? For some reason our CI doesn't run on forks.

turneram

comment created time in 4 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha d1178a1b61036bc7f1cce9bed7c4cd8cb2b0316d

Aggegrate result

push time in 4 hours

create branch ROCmSoftwarePlatform/AMDMIGraphX

branch : report

created branch time in 4 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 41ef514d22275db0c5aefa38117f7e795b63f5be

Add another const

push time in 5 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 190cf8f76857b5a0ffd9ab175706aed316fa6c42

Disable check

push time in 5 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 1fcf6b3cf9eb1457fa0b90413ea0b045ed5eede1

Rename stage

push time in 5 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha e2fed7af6a6816227386787bb6a1e5361a214e6b

Add more const

Paul

commit sha 8c32076f349facdf5e3b4567fc86c0630171f803

Formatting

push time in 5 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha de0e8b157be270a84d541784f03b666e7376d1aa

Add clang-format

push time in 17 hours

Pull request review comment ROCmSoftwarePlatform/MIOpen

[igemm] Adding flexible layout support

 struct TensorDescriptor : miopenTensorDescriptor
     std::string ToString() const;

+    template <class Vector, class Op>
+    inline std::vector<int64_t> sort_permutation(const Vector& data, Op op) const

This can be declared as static.

jerryyin

comment created time in 20 hours
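
The suggestion above can be sketched as follows. This is a minimal illustration, not MIOpen code: `TensorDescriptorSketch` and `descending_stride_order` are hypothetical names introduced here, and only the body of `sort_permutation` comes from the diff. Because the helper reads nothing from `*this`, marking it `static` lets callers use it without an object.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the descriptor in the diff; only the helper matters here.
struct TensorDescriptorSketch
{
    // static: the helper touches no member data, so it needs no object to be called on.
    template <class Vector, class Op>
    static std::vector<std::int64_t> sort_permutation(const Vector& data, Op op)
    {
        std::vector<std::int64_t> result(data.size());
        std::iota(result.begin(), result.end(), 0);
        std::sort(result.begin(), result.end(),
                  [&](auto x, auto y) { return op(data[x], data[y]); });
        return result;
    }
};

// Hypothetical usage helper: indices of `strides` ordered from largest to smallest stride.
inline std::vector<std::int64_t> descending_stride_order(const std::vector<std::size_t>& strides)
{
    return TensorDescriptorSketch::sort_permutation(strides, std::greater<>{});
}
```

The call site names the type rather than an instance, which also makes it clear that the result depends only on the arguments.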

Pull request review comment ROCmSoftwarePlatform/MIOpen

[igemm] Adding flexible layout support

 struct TensorDescriptor : miopenTensorDescriptor
     std::string ToString() const;

+    template <class Vector, class Op>
+    inline std::vector<int64_t> sort_permutation(const Vector& data, Op op) const
+    {
+        std::vector<std::int64_t> result(data.size());
+        std::iota(result.begin(), result.end(), 0);
+        std::sort(
+            result.begin(), result.end(), [&](auto x, auto y) { return op(data[x], data[y]); });
+        return result;
+    }
+
+    std::string GetLayout(std::string labels) const
+    {
+        if(labels.size() != strides.size())
+        {
+            MIOPEN_THROW(
+                "Invalid labels size. Layout labels size must be equavalent to stride size");
+        }
+
+        auto result = labels;

Yes, this is to avoid calling push_back, as incrementing a pointer should be faster.

jerryyin

comment created time in 20 hours
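
A minimal sketch of the pattern discussed in this comment, under stated assumptions: `compute_layout` is a hypothetical free function, and the rule "order the labels by descending stride" is a guess at what the full `GetLayout` does, since the diff is cut off after `auto result = labels;`. The point it illustrates is that copying `labels` gives an output of the right size up front, so each character is written through an iterator rather than appended with `push_back`. Whether this is measurably faster would need a benchmark.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical free-function version of the layout computation.
inline std::string compute_layout(const std::vector<std::size_t>& strides,
                                  const std::string& labels)
{
    assert(labels.size() == strides.size());

    // Permutation of indices sorted so that the largest stride comes first.
    std::vector<std::int64_t> perm(strides.size());
    std::iota(perm.begin(), perm.end(), 0);
    std::sort(perm.begin(), perm.end(),
              [&](auto x, auto y) { return strides[x] > strides[y]; });

    // Start from a copy so the output already has its final size, then overwrite
    // each slot through an output iterator instead of growing with push_back.
    std::string result = labels;
    std::transform(perm.begin(), perm.end(), result.begin(),
                   [&](auto i) { return labels[i]; });
    return result;
}
```

For example, with labels "NCHW" and strides {60, 1, 15, 3} (the channel dimension is the fastest varying), the sketch returns "NHWC".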

Pull request review comment ROCmSoftwarePlatform/MIOpen

[igemm] Adding flexible layout support

 struct TensorDescriptor : miopenTensorDescriptor
     std::string ToString() const;

+    template <class Vector, class Op>
+    inline std::vector<int64_t> sort_permutation(const Vector& data, Op op) const
+    {
+        std::vector<std::int64_t> result(data.size());
+        std::iota(result.begin(), result.end(), 0);
+        std::sort(
+            result.begin(), result.end(), [&](auto x, auto y) { return op(data[x], data[y]); });
+        return result;
+    }
+
+    std::string GetLayout(std::string labels) const

On the other hand, we can certainly cache it in a member variable, so that the layout is computed only the first time GetLayout() is called and later calls just return the cached copy.

I don't think this is a good idea. GetLayout (or ComputeLayout) should be const.

It would be better to store the layout string in the ConvolutionContext so that it only needs to be computed once.

jerryyin

comment created time in 20 hours
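
The alternative proposed in this comment could look roughly like the sketch below. All names (`DescriptorSketch`, `ContextSketch`, `order`, `in_layout`) are hypothetical, and the real `ConvolutionContext` is not shown in the source, so this only illustrates the design: the descriptor method stays `const` with no hidden cache, and the object that needs the layout computes it once and stores the result.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical descriptor: `order[i]` is the index of the label placed at position i.
struct DescriptorSketch
{
    std::vector<std::size_t> order;

    // const: calling it never changes the descriptor, so nothing is cached here.
    std::string ComputeLayout(const std::string& labels) const
    {
        std::string result = labels;
        for(std::size_t i = 0; i < order.size() && i < result.size(); ++i)
            result[i] = labels.at(order[i]);
        return result;
    }
};

// Hypothetical context: computes the layout once at construction and keeps it.
struct ContextSketch
{
    std::string in_layout;

    explicit ContextSketch(const DescriptorSketch& desc)
        : in_layout(desc.ComputeLayout("NCHW"))
    {
    }
};
```

Code that consults the layout repeatedly then reads `in_layout` instead of recomputing it, and the descriptor needs no `mutable` members.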

Pull request review comment ROCmSoftwarePlatform/MIOpen

[igemm] Adding flexible layout support

 struct TensorDescriptor : miopenTensorDescriptor
     std::string ToString() const;

+    template <class Vector, class Op>
+    inline std::vector<int64_t> sort_permutation(const Vector& data, Op op) const
+    {
+        std::vector<std::int64_t> result(data.size());
+        std::iota(result.begin(), result.end(), 0);
+        std::sort(
+            result.begin(), result.end(), [&](auto x, auto y) { return op(data[x], data[y]); });
+        return result;
+    }
+
+    std::string GetLayout(std::string labels) const

I think renaming it to ComputeLayout makes sense since it needs to do a calculation.

jerryyin

comment created time in 20 hours

Pull request review comment ROCmSoftwarePlatform/MIOpen

Add MIOpenTensile in GEMM path

 RUN cget -p $PREFIX install RadeonOpenCompute/rocm-cmake@master
 RUN cget -p $PREFIX install pfultz2/rocm-recipes
 ADD min-requirements.txt /min-requirements.txt
 RUN CXXFLAGS='-isystem $PREFIX/include' cget -p $PREFIX install -f /min-requirements.txt
-
+RUN export HIPCC_LINK_FLAGS_APPEND='-O3 -parallel-jobs=4'
+RUN export HIPCC_COMPILE_FLAGS_APPEND='-O3 -Wno-format-nonliteral -parallel-jobs=4'
+# install last released miopentensile in default, install latest commits when MIOTENSILE_VER="latest"
+RUN if [ "$MIOTENSILE_VER" = "latest" ] ; then cget -p $PREFIX install -DAMDGPU_TARGETS=${GPU_ARCH} ROCmSoftwarePlatform/MIOpenTensile@e258bbaed0bd9de546ea38c9a5c42f71fa41d9a0; else cget -p $PREFIX install -DAMDGPU_TARGETS=${GPU_ARCH} ROCmSoftwarePlatform/MIOpenTensile@2e3e792b2674bf8cdf2620749a298ed5313351bb; fi

-DAMDGPU_TARGETS=${GPU_ARCH} should be passed to cget init. Also, this should install tensile from a file so that hcc and hip-clang install the same version.

ce1adon

comment created time in 20 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha f17483d02f4798b1b83ca6f180a9d4029519c811

Link threads in the main .so

push time in 20 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul Fultz II

commit sha 24933bd801ddb50e46c035d7b92b8420976cbe03

Add pointwise attribute to operators (#634) * Add pointwise attribute * Formatting * Fix compilation * Remove unused variable * Formatting Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Shucai Xiao

commit sha 8bf97a2ffb5620863495e0ec5db9ae29494f4eb4

Dockerfile download onnx models to enable real model unit tests (#628) * turn on the alexnet unit tests through downloading the real model in dockerfile * clang format * refine home directory * enable all real model unit tests * clang format * fix review comments and disable incorrect test cases * clang format * increase timeout value to avoid time out * turn on one more test * disable one more real model unit test * fix review comments * remove unnecessary comment lines * clang format * fix review comments * print out program info when there is an error * clang format * redirect c++ stdout to sys.stdout of python for python api * clang format * two minor issues * fix a cppcheck error * fix review comments * clang format * remove unnecessary changes * refine script

Shucai Xiao

commit sha 16a03b39ea2760dbdef593529a5d0d3e62ffdfb6

Concat transpose bug (#638) * fix a bug related to concat transpose. * clang format * use return instruction to replace the fake instruction Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha b83cd632f8f17e5d0a02d07a0f1a7e9f1f9b0d12

Improve API for program parameters (#635) * Take numpy array directly in python API * Formatting * Intialize program parameters from initializer list * Formatting Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha 3bbaf664e513d1e2496af01dcc8daef43d0f44cd

Bump version

Shucai Xiao

commit sha 549402479940b2a089281bea836cae75b7d14a3e

Update ort commit to latest ort changes (#643) Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

Shucai Xiao

commit sha 48fa934d180cda7b6764b21465e203e39ca1cab3

Selu operator (#642) * code backup * clang format * support for sele operator * clang format * added an onnx unit test for selu * clang format * add more unit tests for the selu operation Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha db6db730c78a8cabe6e6814842a7b828a5e5b8b0

Merge branch 'develop' into debug-libcpp

push time in 21 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha b464a895ad452f8b0c1c18eae266ac30f978b791

Remove comments

push time in 21 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 1f2c0434a13830c9cfb9374d93b1b3a39686ef6c

Add hip-clang docker builds

push time in 21 hours

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 663dc52750d2460f569525da0e1674406845a60b

Add more const methods

push time in 21 hours

delete branch ROCmSoftwarePlatform/AMDMIGraphX

delete branch : rocm-3.9.x

delete time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 3039e30b2e4b709303f3d71587c6764c8c38e2dd

Add missing body variable

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha bb1f67497b8466c85e4963c608d5a9b357b881a9

Fix shadow

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 178d1f0a6548ea0a7759b35047dd40edc34f1b18

Refactor jenkinsfile

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha dd4eee5596d39bbfa6e57ed7279aa410ca80431c

Use named parameters

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Shucai Xiao

commit sha 93be5e2b1cb064b253eced3b6d12496448f3ee81

Bert fuse slice reshape trans contiguous (#542) * fix pad calc * Add decompose pass * Add decompose test * Formatting * bert tf passes correctness * formatting * Add remap * Formatting * add test * formatting * remove comment * Add compute method for dot * Formatting * add inline * Add finder for horizontal fusion * Formatting * Formatting * Reuse predicate * formatting * fix order for literal * formatting * add test for gelu * formatting * added add_gelu fusion * Add gemm fusions * Formatting * add files * formatting * test no mul_add * formatting * progress on div * formatting * continue work on pass * remove layernorm opt * revert reduce file * Add some fixes for convolution * Formatting * Fix shape tests * Formatting * Reuse axis equal * Add initial split fusion * Formatting * Update offset * Workaround outputs that cant accept nonstandard shapes * Formatting * Add check for split concat * Formatting * Add missing headers * Formatting * Add tests * Formatting * add optimization for bert * code backup for bert optimization * continue testing * formatting * fix matcher * formatting * add gelu_fn and tests * formatting * fix matcher, remove extra tests * formatting * fix matcher * add missing files * add find_layernorm * add add_transpose to cmake file * code backup for the contigous fusion * refine ops fusion * clang format * fixed bug in previous optimization * clang format * add more optimization * remove unnecessary code * refinement of the fustion code * clang format * fixed a bug * add used_once * formatting * start on new gelu * formatting * add matchers in fuse_ops * formatting * add dce to fix add_gelu * add simplify_rsqrt and test * formatting * debugging value for matcher * formatting * add more to matchers * formatting * fix errors * remove onnx gen * add any_arg, change matchers to use either_arg * formatting * clang format * formatting * add used_once * formatting * code cleanup * clang format * fixed a bug * remove unnecessary code * refine comments 
* optimize bert to remove more contiguous * clang format * remove unnecessary code * add unit tests for bert optimization * clang format * fix review comments * clang format * refine a fusion of reshape and slice * clang format * fix cppcheck error * fix review comments * add the fusion of slice and transpose * clang format * add another optimization to fuse slice and transpose * clang format * fix review comments * clang format * fix review comments * clang format * fix review comments Co-authored-by: Khalique <15948690+kahmed10@users.noreply.github.com> Co-authored-by: Paul <pfultz2@yahoo.com> Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com> Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan>

kahmed10

commit sha cb722cf9e3591355c68495921b89148d08c015fc

Enable read support for n-dimensional ops (#537) * initial progress * formatting * add pooling changes * formatting * change eliminate_pad * formatting * rename var * fomratting * update op shape test and compute * formatting * revert conv constructor * formatting * change initializer * formatting * fix tidy * change quant conv and shape check * add tests and fixes * formatting * fix type * fix conv test * formatting * add pooling and bn tests * formatting * add inconsistent attr tests * fix padding issue * formatting * fix review comments, remove duplicate test * formatting * fix variable * fix assert bug * fix attr check * remove std Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

kahmed10

commit sha 39dca6ac4f60877e557264dfc72e9ee4b5c47b3a

Embedding bag op (limited support) (#545) * initial progress on embedding_bag * formatting * add test * move enum * formatting * fix tidy * improve test for more coverage * formatting * update arg and test * formatting * add more tests * formatting * fix enum * formatting

Shucai Xiao

commit sha c89c90db5ed145a79640c5dbc8d4d22d7839b8d8

Pooling_nd_cpu_implementation (#548) * initial progress * formatting * add pooling changes * formatting * change eliminate_pad * formatting * rename var * fomratting * update op shape test and compute * formatting * revert conv constructor * formatting * change initializer * formatting * fix tidy * change quant conv and shape check * add tests and fixes * formatting * fix type * fix conv test * formatting * add pooling and bn tests * formatting * add inconsistent attr tests * fix padding issue * formatting * fix review comments, remove duplicate test * formatting * fix variable * fix assert bug * fix attr check * remove std * nd pooling cpu implementation * add unit test for 1d and 3d pooling operator * add more unit test for avareage pooling * add pooling unit tests for cpu implementation * clang format * fix cppcheck error * clang format Co-authored-by: Khalique <15948690+kahmed10@users.noreply.github.com>

Paul

commit sha c5d5c131c80f66e753a22197fd5dfa79ae816d78

Bump version

Paul

commit sha b72dac968bfa342c6d44b377b3599c007d55af63

Merge

Paul Fultz II

commit sha 59e36b7243e3a473fd1261f799431191bb1584a4

Test with onnx runtime (#552) * Build and test onnxrt * Add sudo command * Add sudo * Add pkgconfig * Make root user * Move unstash out * Remove noncps * Add NonCPS back * Remove all noncps * Use each method * Move unstash command * Unstash before * Move stash command up * Move unstash to noncps function * Remove noncps * Use a function to unstash * Remove call to unused function * Change order of args * Add another rocmtestnode overload * List files * Use capital R * Search in build directory * Use force * Use newer cmake with onnx * Install requirements * Print out pip list * Install pip3 * Add cxxflags for hip * Generate locale * Install wheel with pip3 * Disable pip installation * Disable build wheel

kahmed10

commit sha 1cc724eec3e4037a14f5d872df6c21e776b6c880

ND convolution GPU support (#550) * initial progress * formatting * add pooling changes * formatting * change eliminate_pad * formatting * rename var * fomratting * update op shape test and compute * formatting * revert conv constructor * formatting * change initializer * formatting * fix tidy * change quant conv and shape check * add tests and fixes * formatting * fix type * fix conv test * formatting * add pooling and bn tests * formatting * add inconsistent attr tests * fix padding issue * formatting * progress on 1d to 2d * formatting * change compute and compile functions * formatting * fix duplicate * fix conflict * fix issue with 1d conv * formatting * add check for 3d limit * rename function * formatting * rename functions * formatting * add op_shape test * change functions * formatting * change to copy_backward * formatting

Shucai Xiao

commit sha dd6523c90f277d465e6ccaf6b1b4fe5ddd857350

Gather elements operator (#549) * code backup * clang format * fix compiling errors * clang format * rename a few files * rename a few files * fix variable bugs * clang format * add an operator to shift input sequences * clang format * fixed a bug * clang format * fixed a bug * clang format * code backup * clang format * code backup * clang format * code backup * clang format * refine code related lstm operator optimization * clang format * fix various bugs * clang format * fixed a bug in rewrite_lstm * clang format * fixed another bug * refine two operator names * clang format * refine file names * fix cppcheck error * clang format * fix cppcheck error * clang format * fix cppcheck error * fixed review comments * clang format * add unit tests * clang format * add unit tests * clang format * refine unit tests for better coverage * clang format * fixed a bug * fix cppcheck error * fix review comments * clang format * rename two operators according to review comments * clang format * add parsing the operator GatherElements * clang format * add onnx unit tests for the gather_elments operator * clang format * clang format * remove unnecessary files * remove unnecessary files * add a verify onnx unit test for the gather element operator * clang format Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan> Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Shucai Xiao

commit sha 866cca5be094f295b6de55e5dc6728a626abf7ba

Neg operator (#557) * add the neg operator * clang format * add missing operator * fixed a cppcheck error * change to use the neg operator * clang format

Shucai Xiao

commit sha 158bf57ce47239ffc7f55c2e1e721321711dbffe

Add quantization c api (#547) * add quantization_fp16 c api * clang format * add quantization c api * clang format * backup code for add_c_api of quantization * add c/c++ api for the quantization * clang format * fix a cppcheck error * fix cpp check error * add unit test for quantization apis * clang format * fix cppcheck error * clang format * refine unit tests to cover more code changes * clang format * refine unit tests for more code change coverage * add an op_names class * clang format * refine a unit test for more code change coverage * code backup * clang format * remove unnecessary code * fix review comments * clang format Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

kahmed10

commit sha 742b4b82cd1fddcbd185d50cd56fcd9f2e4d0527

Nd deconv read support (#554) * initial progress * formatting * check existing tests * formatting * change for loop to transform * formatting * add tests * formatting * remove comment * add more tests * remove extra slice axes Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Shucai Xiao

commit sha 61cbe923c961ff8bf7b694f3bce8ab0127f9ff70

Cpu batchnorm (#562) * change the batchnorm cpu implementation to support multiple input dimensions * clang format * add unit tests for cpu batch_norm nd implementation * clang format Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

kahmed10

commit sha 61d49048b4b4b63576ed271b25edf53c5c36564d

update for debug info in documentation (#530) * update for debug info * Update README.md * Update README.md * Update README.md Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Shucai Xiao

commit sha 58e1fef7997146adbdd129d2657c9f176096df28

update ort commit hash to prioritize MIGraphX EP in unit test (#566) Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

kahmed10

commit sha a5fb837d11302f4480acf19f673de3655907b240

Nd deconv GPU support (#558) * initial progress * formatting * check existing tests * formatting * change for loop to transform * formatting * add tests * formatting * remove comment * add more tests * update gpu miopen calls * formatting * fix error msg Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Shucai Xiao

commit sha 8ca7b140a0d6ea0534e6066639eae5807f52c4f5

Bert squad eliminate contiguous (#567) * code backup * clang format * refine the algorithm to support more scenarios * clang format * fix review comments * clang format * add one more unit tests to have more code change coverage Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

kahmed10

commit sha d1258e805aee557612a96b6d75a80a573d39fc33

Nd pooling gpu (#551) * initial progress * formatting * add pooling changes * formatting * change eliminate_pad * formatting * rename var * fomratting * update op shape test and compute * formatting * revert conv constructor * formatting * change initializer * formatting * fix tidy * change quant conv and shape check * add tests and fixes * formatting * fix type * fix conv test * formatting * add pooling and bn tests * formatting * add inconsistent attr tests * fix padding issue * formatting * progress on 1d to 2d * formatting * change compute and compile functions * formatting * fix duplicate * fix conflict * fix issue with 1d conv * formatting * add check for 3d limit * rename function * formatting * update to MIOPen 2.3 * add support for nd pooling * formatting * test miopen 2.4 * change function name * rename functions * formatting * add op_shape test * add gpu ops tests * formatting * add pkg-config * change functions * formatting * change to copy_backward * formatting * test diff miopen version * add pooling shape tests * temp disable test * revert to miopen 2.4 Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha 5cc6e1607debef812a2137904d44a02fb7c7025d

Add conv ND for cpu (#561) * Initial cpu conv-nd * Formatting * Make index signed * Formatting * Assert the indices are greater than 0 * Use equal instead of lexicographical_compare * Formatting * Fix tidy errors * Formatting * Handle different types * Formatting * Fix nested visits * Formatting * Add 3d conv test * Formatting * revert unnecessary changes * remove a print line * Fix ICE * Formatting * Use absolute path Co-authored-by: Shucai Xiao <shucai.xiao@amd.com> Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha 9d16eaca84fefbabae29a8e5eebb0ea67799475f

Fix compile error with no dpp reductions (#571) Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 0dd98e12551c1f86d3b759c310627b6f3e75658c

Add auto register header

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 8b282da1a15126ae1f3face079bfa3e10b2b6c70

Use named param for node

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha f8d754fd2f7bb7a15e239757cd16bc25d1107a99

Remove noncps

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 98beae6650e7de4f9d82c0cc4f32689e7ffb2522

Add overload

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha b7a1d4db41ac1dc0a3d4afeef466b84cc62339e3

Remove overload

push time in a day

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 51a9a709d12f4cacbc89b5c3bd1bb9d6c18c8199

Use named parameters

push time in a day

push event pfultz2/cppcheck

Paul

commit sha 58e86ba3a708a75799363d28e2fbb23efdb3e2d4

Format

push time in 4 days

PR opened danmar/cppcheck

Fix issue 9916: False positive: duplicateAssignExpression when it's checked if variables have initial value later

This is a partial fix. It makes the warning in issue 9916 inconclusive. We could look into removing the inconclusive warning in a future PR, but I think we should add another flag to enable more aggressive copy-paste detection.

+45 -10

0 comments

2 changed files

pr created time in 4 days

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 51f80de91dc6fc0a6341d3430ede90a44f5df747

Fix nodiscard warnings

Paul

commit sha 8ee1e5b994fd5f8f67aebc476b0a2eb31fc286d5

Formatting

push time in 4 days

create branch pfultz2/cppcheck

branch : duplicate-assign-expression-equal

created branch time in 4 days

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 30d3d021237215e4fa350c62682236633c8a6d7b

Fix rocblas function call

Paul

commit sha bd0a76567793384586a71ea5eda679117da3af21

Formatting

push time in 4 days

create branch ROCmSoftwarePlatform/AMDMIGraphX

branch : hip-clang

created branch time in 4 days

PR opened danmar/cppcheck

Add emscripten cfg

This adds the EM_JS and EM_ASM macros, which take JavaScript.

+5 -0

0 comments

1 changed file

pr created time in 4 days

push event pfultz2/cppcheck

Paul

commit sha 24e09ff030c2a3f89fcf6148ebee996efca5cd38

Add emscripten cfg

push time in 4 days

create branch pfultz2/cppcheck

branch : emscripten-cfg

created branch time in 4 days

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 6d2a656da0d49e50b68a408858a780d8517d9d95

Add test for scalar

Paul

commit sha fd0accdeecdeac2d44bc50fad8a920b5a75f83de

Formatting

push time in 4 days

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 72e29d4916406373e03fc91aefd73a4bafb5b95e

Make sure inputs are standard inputs

Paul

commit sha c61afab175ad366639b811c5916a4af0428cd285

Formatting

push time in 4 days

issue comment ROCmSoftwarePlatform/AMDMIGraphX

Update Python interface to build Python 3.* versions simultaneously.

Here is the related pybind11 issue. It looks like it was closed because of this PR, which adds a PYBIND11_NOPYTHON so that multiple python versions can be targeted.

mvermeulen

comment created time in 5 days

pull request comment ROCmSoftwarePlatform/AMDMIGraphX

Selu operator

This can be merged; the deepcode failure should just be ignored.

scxiao

comment created time in 5 days

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Add build flag for fast math

+#include <test.hpp>
+#include <migraphx/quantization.hpp>
+#include <migraphx/iterator_for.hpp>
+#include "test_utils.hpp"
+#include "test.hpp"
+
+migraphx::program create_gelu()
+{
+    migraphx::program p;
+    std::vector<float> data0 = {0.044715};
+    std::vector<float> data1 = {0.797885};
+    std::vector<float> data2 = {3};
+    std::vector<float> data3 = {0.5};
+    migraphx::shape s0{migraphx::shape::float_type, {1}};
+
+    std::vector<size_t> x_dims{1, 1, 5};
+
+    auto x         = p.add_parameter("x", migraphx::shape{migraphx::shape::float_type, x_dims});
+    auto const_val = p.add_literal(migraphx::literal{s0, data0});
+    auto sqrt_2_pi = p.add_literal(migraphx::literal{s0, data1});
+    auto three_val = p.add_literal(migraphx::literal{s0, data2});
+    auto half_val  = p.add_literal(migraphx::literal{s0, data3});
+
+    auto mbcast_3         = p.add_instruction(migraphx::op::multibroadcast{x_dims}, three_val);
+    auto pow_op           = p.add_instruction(migraphx::op::pow{}, x, mbcast_3);
+    auto mbcast_const     = p.add_instruction(migraphx::op::multibroadcast{x_dims}, const_val);
+    auto mul_const        = p.add_instruction(migraphx::op::mul{}, mbcast_const, pow_op);
+    auto add_x            = p.add_instruction(migraphx::op::add{}, x, mul_const);
+    auto mbcast_sqrt_2_pi = p.add_instruction(migraphx::op::multibroadcast{x_dims}, sqrt_2_pi);
+    auto mul_add_x        = p.add_instruction(migraphx::op::mul{}, mbcast_sqrt_2_pi, add_x);
+    auto tanh_op          = p.add_instruction(migraphx::op::tanh{}, mul_add_x);
+    auto mbcast_half      = p.add_instruction(migraphx::op::multibroadcast{x_dims}, half_val);
+    auto mul_half         = p.add_instruction(migraphx::op::mul{}, mbcast_half, tanh_op);
+    auto add_mul_half     = p.add_instruction(migraphx::op::add{}, mul_half, mbcast_half);
+    auto mul_x            = p.add_instruction(migraphx::op::mul{}, x, add_mul_half);
+    p.add_return({mul_x});
+
+    return p;
+}
+
+void check_gelu_version(const migraphx::program& p, const std::string& gelu_name)
+{
+    bool found_gelu = false;
+    for(auto ins : iterator_for(p))
+    {
+        if(ins->name() == gelu_name)
+            found_gelu = true;
+    }
+    CHECK(found_gelu);
+}
+
+TEST_CASE(enable_fast_gelu)
+{
+    migraphx::program p = create_gelu();
+    p.compile(migraphx::gpu::target{});
+    check_gelu_version(p, "gpu::gelu");

This could be written with any_of:

CHECK(any_of(p, [&](auto&& i) { return i.name() == "gpu::gelu"; }));
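For illustration, here is a minimal standalone sketch of the same idea using std::any_of from the standard library. The `instruction` struct and the `has_instruction` helper are hypothetical stand-ins, not MIGraphX types; only the pattern of replacing the flag-and-loop with a single predicate search is the point.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction; only name() matters here.
struct instruction
{
    std::string op_name;
    const std::string& name() const { return op_name; }
};

// Returns true when any instruction in the sequence has the given name.
// This replaces the "bool found = false; for(...) if(...) found = true;" loop.
bool has_instruction(const std::vector<instruction>& prog, const std::string& name)
{
    return std::any_of(prog.begin(), prog.end(), [&](const instruction& ins) {
        return ins.name() == name;
    });
}
```

Besides being shorter, std::any_of stops at the first match, whereas the hand-written loop keeps iterating after the flag is set.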
kahmed10

comment created time in 5 days

PullRequestReviewEvent

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul Fultz II

commit sha 24933bd801ddb50e46c035d7b92b8420976cbe03

Add pointwise attribute to operators (#634) * Add pointwise attribute * Formatting * Fix compilation * Remove unused variable * Formatting Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Shucai Xiao

commit sha 8bf97a2ffb5620863495e0ec5db9ae29494f4eb4

Dockerfile download onnx models to enable real model unit tests (#628) * turn on the alexnet unit tests through downloading the real model in dockerfile * clang format * refine home directory * enable all real model unit tests * clang format * fix review comments and disable incorrect test cases * clang format * increase timeout value to avoid time out * turn on one more test * disable one more real model unit test * fix review comments * remove unnecessary comment lines * clang format * fix review comments * print out program info when there is an error * clang format * redirect c++ stdout to sys.stdout of python for python api * clang format * two minor issues * fix a cppcheck error * fix review comments * clang format * remove unnecessary changes * refine script

Shucai Xiao

commit sha 16a03b39ea2760dbdef593529a5d0d3e62ffdfb6

Concat transpose bug (#638) * fix a bug related to concat transpose. * clang format * use return instruction to replace the fake instruction Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha b83cd632f8f17e5d0a02d07a0f1a7e9f1f9b0d12

Improve API for program parameters (#635) * Take numpy array directly in python API * Formatting * Intialize program parameters from initializer list * Formatting Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha 3bbaf664e513d1e2496af01dcc8daef43d0f44cd

Bump version

Shucai Xiao

commit sha 549402479940b2a089281bea836cae75b7d14a3e

Update ort commit to latest ort changes (#643) Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

Paul Fultz II

commit sha 989e7cda18780146ff579cdd2b56ebf3fcafd6d4

Merge branch 'develop' into analyze-streams

push time in 5 days

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Add parallel stream analysis

+#include <migraphx/analyze_streams.hpp>
+#include <migraphx/program.hpp>
+#include <migraphx/iterator_for.hpp>
+#include <migraphx/ranges.hpp>
+#include <migraphx/instruction.hpp>
+#include <migraphx/errors.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+bool happens_before(const std::vector<std::size_t>& e1, const std::vector<std::size_t>& e2)
+{
+    return std::equal(e1.begin(), e1.end(), e2.begin(), e2.end(), std::less_equal<>{}) and
+           not std::equal(e1.begin(), e1.end(), e2.begin(), e2.end(), std::greater_equal<>{});

The logic is that all of them should be less than or equal, and at least one of them should be strictly less than. See here.
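A standalone sketch of that comparison, copied out of the diff context so it can be tried in isolation (the surrounding MIGraphX namespaces are dropped here, and `&&`/`!` are used in place of `and`/`not`). Given that every component of e1 is less than or equal to the matching component of e2, "not all greater than or equal" is the same as "at least one strictly less".

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// e1 happens before e2 when every component of e1 is <= the matching
// component of e2 and the clocks are not component-wise >= each other,
// which together mean at least one component is strictly smaller.
bool happens_before(const std::vector<std::size_t>& e1, const std::vector<std::size_t>& e2)
{
    return std::equal(e1.begin(), e1.end(), e2.begin(), e2.end(), std::less_equal<>{}) &&
           !std::equal(e1.begin(), e1.end(), e2.begin(), e2.end(), std::greater_equal<>{});
}
```

With this definition, identical clocks are not ordered, and clocks where each side is ahead in some component (concurrent events) are not ordered in either direction.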

pfultz2

comment created time in 5 days

PullRequestReviewEvent

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 44ac51b67cf42ff426c4a1874240bd13d4a2b9a8

Only validate matches when env var enables it

Paul

commit sha c81843bf8edbbceaa48970f8763949641e759196

Formatting

push time in 5 days

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 5c17510a2e78f151000ef009b21c5b65acc5e102

Make sure splits are only used once

Paul

commit sha c7b20a2ae093b79a5f0b3916d21c4233ebc3ea3e

Formatting

push time in 5 days

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#include <algorithm>
+#include <string>
+#include <vector>
+#include <functional>
+#include <sstream>
+#include <migraphx/errors.hpp>
+#include <migraphx/ranges.hpp>
+#include "convert_to_json.hpp"
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+using token = std::pair<const char*, const char*>;
+using lexer = std::function<const char*(const char* start, const char* end)>;
+
+template <class P>
+auto lex_while(P p)
+{
+    return [=](const char* start, const char* end) {
+        return std::find_if(start, end, [&](char c) { return not p(c); });
+    };
+}
+
+template <class P>
+auto lex_if(P p)
+{
+    return [=](const char* start, const char*) {
+        if(p(*start))
+            return start + 1;
+        return start;
+    };
+}
+
+std::vector<token> tokenize(const char* start, const char* end, std::vector<lexer> lexers)
+{
+    std::vector<token> result;
+    while(start != end)
+    {
+        bool error = true;
+        for(auto l : lexers)
+        {
+            auto next = l(start, end);
+            if(next != start)
+            {
+                if(not std::all_of(start, next, &isspace))

This condition should be removed, so we don't strip whitespace in the output.

scxiao

comment created time in 5 days

PullRequestReviewEvent

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 20491bc5b09f3db4ff1bf779ef760edeab1693b2

Check operator order

Paul

commit sha 5c4d97006c0a8e117e7e96055c1e3bffaff87eca

Formatting

push time in 5 days

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Add build flag for fast math

 void manual_identity()
     std::cout << result << std::endl;
 }
+void manual_disable_fast_gelu()
+{
+    migraphx::program p;
+    std::vector<float> data0 = {0.044715};
+    std::vector<float> data1 = {0.797885};
+    std::vector<float> data2 = {3};
+    std::vector<float> data3 = {0.5};
+    migraphx::shape s0{migraphx::shape::float_type, {1}};
+
+    std::vector<size_t> x_dims{1, 1, 5};
+
+    auto x         = p.add_parameter("x", migraphx::shape{migraphx::shape::float_type, x_dims});
+    auto const_val = p.add_literal(migraphx::literal{s0, data0});
+    auto sqrt_2_pi = p.add_literal(migraphx::literal{s0, data1});
+    auto three_val = p.add_literal(migraphx::literal{s0, data2});
+    auto half_val  = p.add_literal(migraphx::literal{s0, data3});
+
+    auto mbcast_3         = p.add_instruction(migraphx::op::multibroadcast{x_dims}, three_val);
+    auto pow_op           = p.add_instruction(migraphx::op::pow{}, x, mbcast_3);
+    auto mbcast_const     = p.add_instruction(migraphx::op::multibroadcast{x_dims}, const_val);
+    auto mul_const        = p.add_instruction(migraphx::op::mul{}, mbcast_const, pow_op);
+    auto add_x            = p.add_instruction(migraphx::op::add{}, x, mul_const);
+    auto mbcast_sqrt_2_pi = p.add_instruction(migraphx::op::multibroadcast{x_dims}, sqrt_2_pi);
+    auto mul_add_x        = p.add_instruction(migraphx::op::mul{}, mbcast_sqrt_2_pi, add_x);
+    auto tanh_op          = p.add_instruction(migraphx::op::tanh{}, mul_add_x);
+    auto mbcast_half      = p.add_instruction(migraphx::op::multibroadcast{x_dims}, half_val);
+    auto mul_half         = p.add_instruction(migraphx::op::mul{}, mbcast_half, tanh_op);
+    auto add_mul_half     = p.add_instruction(migraphx::op::add{}, mul_half, mbcast_half);
+    auto mul_x            = p.add_instruction(migraphx::op::mul{}, x, add_mul_half);
+    p.add_return({mul_x});
+
+    migraphx::compile_options options;
+    options.fast_math = false;
+    p.compile(migraphx::gpu::target{}, options);
+    bool found_gelu_new = false;
+    for(auto ins : iterator_for(p))
+    {
+        if(ins->name() == "gpu::gelu_new")
+            found_gelu_new = true;
+    }
+    CHECK(found_gelu_new);
+}

Also, there should be two tests, one with it enabled and another with it disabled.

kahmed10

comment created time in 5 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Add build flag for fast math

 void manual_identity()
     std::cout << result << std::endl;
 }
+void manual_disable_fast_gelu()
+{
+    migraphx::program p;
+    std::vector<float> data0 = {0.044715};
+    std::vector<float> data1 = {0.797885};
+    std::vector<float> data2 = {3};
+    std::vector<float> data3 = {0.5};
+    migraphx::shape s0{migraphx::shape::float_type, {1}};
+
+    std::vector<size_t> x_dims{1, 1, 5};
+
+    auto x         = p.add_parameter("x", migraphx::shape{migraphx::shape::float_type, x_dims});
+    auto const_val = p.add_literal(migraphx::literal{s0, data0});
+    auto sqrt_2_pi = p.add_literal(migraphx::literal{s0, data1});
+    auto three_val = p.add_literal(migraphx::literal{s0, data2});
+    auto half_val  = p.add_literal(migraphx::literal{s0, data3});
+
+    auto mbcast_3         = p.add_instruction(migraphx::op::multibroadcast{x_dims}, three_val);
+    auto pow_op           = p.add_instruction(migraphx::op::pow{}, x, mbcast_3);
+    auto mbcast_const     = p.add_instruction(migraphx::op::multibroadcast{x_dims}, const_val);
+    auto mul_const        = p.add_instruction(migraphx::op::mul{}, mbcast_const, pow_op);
+    auto add_x            = p.add_instruction(migraphx::op::add{}, x, mul_const);
+    auto mbcast_sqrt_2_pi = p.add_instruction(migraphx::op::multibroadcast{x_dims}, sqrt_2_pi);
+    auto mul_add_x        = p.add_instruction(migraphx::op::mul{}, mbcast_sqrt_2_pi, add_x);
+    auto tanh_op          = p.add_instruction(migraphx::op::tanh{}, mul_add_x);
+    auto mbcast_half      = p.add_instruction(migraphx::op::multibroadcast{x_dims}, half_val);
+    auto mul_half         = p.add_instruction(migraphx::op::mul{}, mbcast_half, tanh_op);
+    auto add_mul_half     = p.add_instruction(migraphx::op::add{}, mul_half, mbcast_half);
+    auto mul_x            = p.add_instruction(migraphx::op::mul{}, x, add_mul_half);
+    p.add_return({mul_x});
+
+    migraphx::compile_options options;
+    options.fast_math = false;
+    p.compile(migraphx::gpu::target{}, options);
+    bool found_gelu_new = false;
+    for(auto ins : iterator_for(p))
+    {
+        if(ins->name() == "gpu::gelu_new")
+            found_gelu_new = true;
+    }
+    CHECK(found_gelu_new);
+}

This should be put in a different cpp file.

kahmed10

comment created time in 5 days

PullRequestReviewEvent

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul

commit sha 4b9e79d25d20774e7811b1d0440014c4db08d607

Fix find splits

Paul

commit sha b970cdf1aeb068d7440db006da307064b9cd01b1

Formatting

push time in 6 days

delete branch pfultz2/cppcheck

delete branch : fp-known-empt-container-global-scope

delete time in 6 days

delete branch pfultz2/cppcheck

delete branch : fp-valueflow-ternary-before-condition

delete time in 6 days

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#include "json_tokenize.hpp"
+#include <algorithm>
+#include <string>
+#include <vector>
+#include <iostream>
+#include <migraphx/errors.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+siter colon(siter start, siter end)
+{
+    return std::find_if(start, end, [](auto c) { return c == ':'; });
+}
+
+std::pair<siter, siter> key(siter start, siter end)
+{
+    // find key end
+    --end;

Maybe add an assert(*end == ':')

scxiao

comment created time in 6 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#include "json_tokenize.hpp"
+#include <algorithm>
+#include <string>
+#include <vector>
+#include <iostream>
+#include <migraphx/errors.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+siter colon(siter start, siter end)
+{
+    return std::find_if(start, end, [](auto c) { return c == ':'; });
+}
+
+std::pair<siter, siter> key(siter start, siter end)
+{
+    // find key end
+    --end;
+    while(start != end)
+    {
+        if(*end != ' ')
+        {
+            break;
+        }
+        --end;
+    }
+    auto ke = end;
+
+    if(start == end)
+    {
+        if(*ke != '\"')
+        {
+            return {ke, ke};
+        }
+        else
+        {
+            MIGRAPHX_THROW("KEY: single quote cannot be a key!");
+        }
+    }
+
+    // find key start
+    --end;
+    while(start != end)
+    {
+        // match
+        if(*ke == '\"' and *end == '\"')
+        {
+            break;
+        }
+        else if(*ke != '\"' and (std::ispunct(*end) or *end == ' '))
+        {
+            ++end;
+            break;
+        }
+        --end;
+    }

This loop could be written as this:

if(*ke == '\"')
{
    end = std::find_if(std::make_reverse_iterator(end), std::make_reverse_iterator(start), [](char c) {
        return c == '\"';
    }).base();
}
else
{
    end = std::find_if(std::make_reverse_iterator(end), std::make_reverse_iterator(start), [](char c) {
        return std::ispunct(c) or c == ' ';
    }).base();
    end++;
}

The *ke condition is moved out of the loop, and we use std::find_if.

scxiao

comment created time in 6 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#include "json_tokenize.hpp"
+#include <algorithm>
+#include <string>
+#include <vector>
+#include <iostream>
+#include <migraphx/errors.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+siter colon(siter start, siter end)
+{
+    return std::find_if(start, end, [](auto c) { return c == ':'; });
+}
+
+std::pair<siter, siter> key(siter start, siter end)
+{
+    // find key end
+    --end;
+    while(start != end)
+    {
+        if(*end != ' ')
+        {
+            break;
+        }
+        --end;

This could be written with find_if and reverse iterators.
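A minimal sketch of what that might look like, assuming the goal is only to skip trailing spaces. The helper name `trim_trailing_spaces` is hypothetical. Note that `base()` on the result of a reverse search points one past the matched character, so unlike `ke` in the loop above (which points at the last non-space character), this returns a past-the-end style iterator.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

using siter = std::string::const_iterator;

// Returns an iterator one past the last non-space character in [start, end),
// or start when the range is empty or contains only spaces.
siter trim_trailing_spaces(siter start, siter end)
{
    // Searching the reversed range visits the last character first.
    // base() converts the match back to a forward iterator, one past the match.
    return std::find_if(std::make_reverse_iterator(end),
                        std::make_reverse_iterator(start),
                        [](char c) { return c != ' '; })
        .base();
}
```

Because the not-found case yields `start`, callers can treat an empty result range uniformly instead of special-casing `start == end`.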

scxiao

comment created time in 6 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#include "json_tokenize.hpp"
+#include <algorithm>
+#include <string>
+#include <vector>
+#include <iostream>
+#include <migraphx/errors.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+siter colon(siter start, siter end)
+{
+    return std::find_if(start, end, [](auto c) { return c == ':'; });
+}
+
+std::pair<siter, siter> key(siter start, siter end)
+{
+    // find key end
+    --end;
+    while(start != end)
+    {
+        if(*end != ' ')
+        {
+            break;
+        }
+        --end;
+    }
+    auto ke = end;
+
+    if(start == end)
+    {
+        if(*ke != '\"')
+        {
+            return {ke, ke};
+        }
+        else
+        {
+            MIGRAPHX_THROW("KEY: single quote cannot be a key!");
+        }
+    }
+
+    // find key start
+    --end;
+    while(start != end)
+    {
+        // match
+        if(*ke == '\"' and *end == '\"')

What if the quote is escaped? Then it shouldn't stop.

scxiao

comment created time in 6 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#include "json_tokenize.hpp"
+#include <algorithm>
+#include <string>
+#include <vector>
+#include <iostream>
+#include <migraphx/errors.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+siter colon(siter start, siter end)
+{
+    return std::find_if(start, end, [](auto c) { return c == ':'; });
+}
+
+std::pair<siter, siter> key(siter start, siter end)
+{
+    // find key end
+    --end;
+    while(start != end)
+    {
+        if(*end != ' ')
+        {
+            break;
+        }
+        --end;
+    }
+    auto ke = end;
+
+    if(start == end)
+    {
+        if(*ke != '\"')
+        {
+            return {ke, ke};
+        }
+        else
+        {
+            MIGRAPHX_THROW("KEY: single quote cannot be a key!");
+        }
+    }
+
+    // find key start
+    --end;
+    while(start != end)
+    {
+        // match
+        if(*ke == '\"' and *end == '\"')
+        {
+            break;
+        }
+        else if(*ke != '\"' and (std::ispunct(*end) or *end == ' '))
+        {
+            ++end;
+            break;
+        }
+        --end;
+    }
+
+    if(start == end)
+    {
+        if(std::ispunct(*end) or *end == ' ')
+            ++end;
+    }
+
+    auto ks = end;
+
+    return {ks, ke};
+}
+
+std::string json_tokenize(const std::string& s)
+{
+    siter start = s.begin();
+    siter end   = s.end();
+    std::vector<token> tokens;
+
+    while(start != end)
+    {
+        auto colon_iter = colon(start, end);

Won't this find a colon in a quote?
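One possible way to avoid that, sketched under the assumption that strings are double-quoted and use backslash escapes. `colon_outside_quotes` is a hypothetical replacement for `colon`, not code from the PR; it also covers the escaped-quote case, since `\"` does not end the string.

```cpp
#include <iterator>
#include <string>

using siter = std::string::const_iterator;

// Returns the first ':' in [start, end) that is not inside a double-quoted
// string, or end if there is none. A backslash inside a string escapes the
// next character, so \" does not close the string.
siter colon_outside_quotes(siter start, siter end)
{
    bool in_quote = false;
    for(auto it = start; it != end; ++it)
    {
        if(in_quote && *it == '\\')
        {
            // Skip the escaped character, if there is one.
            if(std::next(it) != end)
                ++it;
        }
        else if(*it == '"')
        {
            in_quote = !in_quote;
        }
        else if(*it == ':' && !in_quote)
        {
            return it;
        }
    }
    return end;
}
```

For an input such as `{"a:b": 1}`, this skips the colon inside the key and returns the one that separates key from value.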

scxiao

comment created time in 6 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#ifndef MIGRAPHX_GUARD_API_RTGLIB_JSONIZE_ATTR_STRING_HPP
+#define MIGRAPHX_GUARD_API_RTGLIB_JSONIZE_ATTR_STRING_HPP
+
+#include <string>
+#include <migraphx/config.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+using siter = std::string::const_iterator;
+using token = std::pair<siter, siter>;
+
+std::string json_tokenize(const std::string& s);

Maybe we can add unit tests for this function (without the constructor API):

EXPECT(convert_json("{key: 1}") == "{\"key\": 1}");
scxiao

comment created time in 6 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#ifndef MIGRAPHX_GUARD_API_RTGLIB_JSONIZE_ATTR_STRING_HPP
+#define MIGRAPHX_GUARD_API_RTGLIB_JSONIZE_ATTR_STRING_HPP
+
+#include <string>
+#include <migraphx/config.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+using siter = std::string::const_iterator;
+using token = std::pair<siter, siter>;
+
+std::string json_tokenize(const std::string& s);

This should probably be called something like convert_json, as tokenize implies we would be returning tokens.

scxiao

comment created time in 6 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#ifndef MIGRAPHX_GUARD_API_RTGLIB_JSONIZE_ATTR_STRING_HPP
+#define MIGRAPHX_GUARD_API_RTGLIB_JSONIZE_ATTR_STRING_HPP
+
+#include <string>
+#include <migraphx/config.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+using siter = std::string::const_iterator;
+using token = std::pair<siter, siter>;

These should just be moved to the .cpp file.

scxiao

comment created time in 6 days

PullRequestReviewEvent

create branch ROCmSoftwarePlatform/AMDMIGraphX

branch : reshape-attr

created branch time in 7 days

create branch pfultz2/cppcheck

branch : fp-valueflow-ternary-before-condition

created branch time in 7 days

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul Fultz II

commit sha b83cd632f8f17e5d0a02d07a0f1a7e9f1f9b0d12

Improve API for program parameters (#635) * Take numpy array directly in python API * Formatting * Intialize program parameters from initializer list * Formatting Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

Paul Fultz II

commit sha 3bbaf664e513d1e2496af01dcc8daef43d0f44cd

Bump version

Shucai Xiao

commit sha 549402479940b2a089281bea836cae75b7d14a3e

Update ort commit to latest ort changes (#643) Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

Paul Fultz II

commit sha 6835c2e000cf1739303e1786e9bcd06cd15d62a6

Merge branch 'develop' into check-shapes-op-name

push time in 7 days

PullRequestReviewEvent

issue comment RadeonOpenCompute/ROCm

`rocm-clang-ocl` deb package has wrong dependencies or a bug

The clang-ocl package depends on rocm-opencl, but as opencl now uses comgr to compile instead of the compiler, the compiler is no longer provided as a dependency of opencl. Here is a PR to fix that:

https://github.com/RadeonOpenCompute/clang-ocl/pull/32

baryluk

comment created time in 7 days

PR opened RadeonOpenCompute/clang-ocl

Add the compiler as a dependency

As opencl uses comgr for compilation, it no longer has the compiler as a dependency.

+2 -2

0 comment

1 changed file

pr created time in 7 days

create branch RadeonOpenCompute/clang-ocl

branch : pfultz2-clang-dep

created branch time in 7 days

PullRequestReviewEvent

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul Fultz II

commit sha 3bbaf664e513d1e2496af01dcc8daef43d0f44cd

Bump version

Paul Fultz II

commit sha 24377c0fca256b2a136f8ec4d07f1dd225c6f494

Merge branch 'develop' into update_dockerfile_ort_commit

push time in 7 days

push event ROCmSoftwarePlatform/AMDMIGraphX

Paul Fultz II

commit sha 3bbaf664e513d1e2496af01dcc8daef43d0f44cd

Bump version

push time in 7 days

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#include "jsonize_attr_string.hpp"
+#include <vector>
+#include <stack>
+#include <iostream>
+#include <algorithm>
+#include <cassert>
+#include <unordered_map>
+#include <unordered_set>
+#include <utility>
+#include <migraphx/errors.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+// get all elements of an array or an object, including '[]' and '{}'
+std::string get_elements_string(const std::string& str, const std::size_t start, const char brkt)

Actually you could use something like this to tokenize the json string:


template<class P>
auto lex_while(P p)
{
    return [=](const char* start, const char* end) {
        return std::find_if(start, end, [&](char c) {
            return not p(c);
        });
    };
}

template<class P>
auto lex_if(P p)
{
    return [=](const char* start, const char* end) {
        if (p(*start))
            return start+1;
        return start;
    };
}

using token = std::pair<const char*, const char*>;
using lexer = std::function<const char*(const char* start, const char* end)>;

std::vector<token> tokenize(const char* start, const char* end, std::vector<lexer> lexers)
{
    std::vector<token> result;
    while(start != end) 
    {
        bool error = true;
        for(auto l:lexers) 
        {
            auto next = l(start, end);
            if (next != start)
            {
                if (not std::all_of(start, next, &isspace))
                    result.emplace_back(start, next);
                start = next;
                error = false;
                break;
            }
        }
        if (error)
            MIGRAPHX_THROW("Lex error");
    }
    return result;
}

std::vector<token> json_tokenize(const std::string& s)
{
    std::vector<lexer> lexers;
    // Quote
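    lexers.push_back([](const char* start, const char* end) {
        if (*start != '\"')
            return start;
        // Skip the opening quote, then scan to the closing quote
        start++;
        while(start != end and *start != '\"')
        {
            // A backslash escapes the next character
            if (*start == '\\' and start + 1 != end)
                start++;
            start++;
        }
        // Consume the closing quote if the string is terminated
        if (start != end)
            start++;
        return start;
    });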
    // Whitespace
    lexers.push_back(lex_while(&isspace));
    // Punctuation
    lexers.push_back(lex_if(&ispunct));
    // Identifier/number
    lexers.push_back(lex_while([](char c) { return isalnum(c) or c == '_' or c == '.' or c == '+'; }));
    return tokenize(s.data(), s.data()+s.length(), lexers);
}

As we lex identifiers, we can add quotes to them if they don't start with a number, which should add quotes for keys or values. You could also extend the lexer to tokenize floating-point numbers and integers separately.

scxiao

comment created time in 8 days

PullRequestReviewEvent

Pull request review comment ROCmSoftwarePlatform/AMDMIGraphX

Op constructor c/python api

+#include "jsonize_attr_string.hpp"
+#include <vector>
+#include <stack>
+#include <iostream>
+#include <algorithm>
+#include <cassert>
+#include <unordered_map>
+#include <unordered_set>
+#include <utility>
+#include <migraphx/errors.hpp>
+
+namespace migraphx {
+inline namespace MIGRAPHX_INLINE_NS {
+
+// get all elements of an array or an object, including '[]' and '{}'
+std::string get_elements_string(const std::string& str, const std::size_t start, const char brkt)

We don't need to parse the JSON. We can just lex it (and skip whitespace). When the next token is :, we can check whether the previous token has quotes; if not, we add quotes while we serialize it to a string.

Each token can be an iterator (or pointer) pair, so we don't have to make copies of the string as we lex.
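A self-contained sketch of that idea, under stated assumptions: the names `next_token`, `lex`, `is_space` and `is_ident` are hypothetical, `convert_json` borrows the name suggested elsewhere in this review, and treating `-` as part of a bare word is a choice made here so that negative numbers stay in one token. Whitespace is kept as its own token so the output preserves the input spacing, and only bare words followed by `:` are quoted.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A token is a [first, second) pointer pair into the input; nothing is copied.
using token = std::pair<const char*, const char*>;

bool is_space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }

// Characters allowed in a bare (unquoted) word or number.
bool is_ident(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_' || c == '.' ||
           c == '+' || c == '-';
}

// Returns one past the end of the token starting at start (requires start != end).
const char* next_token(const char* start, const char* end)
{
    const char* p = start;
    if(*p == '"')
    {
        // Quoted string: a backslash escapes the next character.
        ++p;
        while(p != end && *p != '"')
        {
            if(*p == '\\' && p + 1 != end)
                ++p;
            ++p;
        }
        return p == end ? p : p + 1;
    }
    if(is_space(*p))
    {
        while(p != end && is_space(*p))
            ++p;
        return p;
    }
    if(is_ident(*p))
    {
        while(p != end && is_ident(*p))
            ++p;
        return p;
    }
    // Anything else (braces, brackets, ':', ',') is a one-character token.
    return p + 1;
}

std::vector<token> lex(const std::string& s)
{
    std::vector<token> tokens;
    const char* p   = s.data();
    const char* end = s.data() + s.size();
    while(p != end)
    {
        const char* next = next_token(p, end);
        tokens.emplace_back(p, next);
        p = next;
    }
    return tokens;
}

// Re-serializes the tokens, quoting any bare word that is followed by ':'.
std::string convert_json(const std::string& s)
{
    const auto tokens = lex(s);
    std::string result;
    for(std::size_t i = 0; i < tokens.size(); i++)
    {
        const token& t = tokens[i];
        // Look ahead to the next non-whitespace token.
        std::size_t j = i + 1;
        while(j < tokens.size() && is_space(*tokens[j].first))
            j++;
        const bool is_key = j < tokens.size() && *tokens[j].first == ':';
        if(is_key && is_ident(*t.first))
            result += '"' + std::string(t.first, t.second) + '"';
        else
            result.append(t.first, t.second);
    }
    return result;
}
```

Because quoted strings are lexed as single tokens, a colon inside a string value is never mistaken for a key separator, and already-quoted keys pass through unchanged.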

scxiao

comment created time in 8 days

PullRequestReviewEvent