intel-graphics-compiler

mirror of https://github.com/intel/intel-graphics-compiler.git synced 2025-10-30 08:18:26 +08:00

Author	SHA1	Message	Date
sys_igc	09573028d0	[Autobackout][FunctionalRegression]Revert of change: `e640d20fc8`: Enable Code Scheduling on recompilation Enable Code Scheduling on recompilation	2025-07-26 06:22:53 +02:00
Pillow, Scott	74151dcc32	Remove unneeded assert Remove unneeded assert	2025-07-26 05:07:18 +02:00
Dmitrichenko, Aleksei	e640d20fc8	Enable Code Scheduling on recompilation Enable Code Scheduling on recompilation	2025-07-25 15:11:38 +02:00
Dmitrichenko, Aleksei	684ab05a6c	Improve CodeScheduling - Add caching for register pressure estimation, real uses computation and values size - Implement fragmentation-aware register pressure adjustment heuristic for large loads - Add new heuristic for prioritizing loads that unlock DPAS instructions - Fix initial register pressure estimation for hoisted loads and corresponding IEs in BBIn - Fix ftobf regpressure estimation - Some changes of the whole scheduling workflow to take advantage of the backtracking - Add new heuristic to put instructions between the load and the subsequent shuffling to hide latency	2025-07-25 13:25:04 +02:00
Kwasniewski, Patryk	f8ce0b6d52	use whole GRF for load 2D Block2d load's return size per block is multiple of GRFs. If the actual returned data per block is not multiple of GRFs, its size is rounded up to the next whole GRF with unused GRF storage filled with zero.	2025-07-25 13:11:23 +02:00
Kanclerz, Piotr	b224b29b4d	Remove workaround -Wno-error=implicit-int Previously, some workloads used implicit int types in function definitions. With LLVM 15, implicit ints are treated as errors. The workaround to disable this error has now been removed, so IGC enforces the same behavior as LLVM 15 and treats implicit ints as errors.	2025-07-25 11:13:24 +02:00
Pillow, Scott	38f7295c4a	Mark gradient intrinsics as convergent Mark gradient intrinsics as convergent	2025-07-25 02:20:46 +02:00
Y	9977bdf206	Add handling for opaque pointer of Function/FunctionType case in ProgramScopeConstantAnalysis The call to IGCLLVM::getNonOpaquePtrEltTy was failing when opaque pointers were enabled. I've introduced check and adequate handling of such scenario	2025-07-25 02:01:45 +02:00
Andrzejewski, Krystian	db24ad6d71	Use `clang-format` in auto-generated files This change is to format auto-generated files with `clang-format` using llvm code style.	2025-07-24 23:33:04 +02:00
Gu, Junjie	77cd23b537	Handle inline asm in vector alias 1. Improve vector alias optim to handle inline asm 2. Allow constant insert elements	2025-07-24 18:14:27 +02:00
Pawel Szymichowski	16fa7371f0	Synchronization between branched ---------------------------	2025-07-24 15:04:24 +02:00
Andrzejewski, Krystian	342c4fb729	Allow customizing function patch names This change is to allow the compiler to set a customized function patch names.	2025-07-23 13:47:37 +02:00
Kwasniewski, Patryk	c4c4eb33c6	use whole GRF for load 2D Block2d load's return size per block is multiple of GRFs. If the actual returned data per block is not multiple of GRFs, its size is rounded up to the next whole GRF with unused GRF storage filled with zero.	2025-07-23 13:08:32 +02:00
sys_igc	a30d372e95	[Autobackout][FunctionalRegression]Revert of change: `1a88d6ea34`: Adjust pre-RA scheduling heuristic Adjust pre-RA scheduling heuristic	2025-07-23 00:29:49 +02:00
sys_igc	f82224aa18	[Autobackout][FunctionalRegression]Revert of change: `61e417b80b`: Allow customizing function patch names This change is to allow the compiler to set a customized function patch names.	2025-07-22 22:16:28 +02:00
Andrzejewski, Krystian	61e417b80b	Allow customizing function patch names This change is to allow the compiler to set a customized function patch names.	2025-07-22 17:07:52 +02:00
Gojska, Grzegorz	bbdc8e9184	[IGC OCL] SYCL Joint Matrix enable 16-bit datatypes for C and D matrices. Enable 16-bit datatypes for accumulator and output matrices in joint matrix. Platforms: PVC, DG2 Keywords: Feature Related-to: GSD-11139 Resolves:	2025-07-22 16:41:26 +02:00
Andrzejewski, Krystian	33a954cf51	Remove redundant guard for a pattern of global imm offsets This simplifies the control flow for a pattern of global imm offsets.	2025-07-22 15:53:34 +02:00
Stefan Ilic	349f3e9d21	Disable SIMD16 drop heuristics for XE3 Disable heuristics due to regression.	2025-07-22 14:40:59 +02:00
Kreczko, Konrad	c8ee2afd36	Update docs Removed old information from docs and updated script snippets to be easily copyable for user experience.	2025-07-22 14:31:04 +02:00
Andrzejewski, Krystian	dac4abb69a	Expose a static function to identify `ContinuationHLIntrinsic` This function allows identifying `ContinuationHLIntrinsic` by the intrinsic id.	2025-07-22 13:04:41 +02:00
Garbowski, Mateusz	e7ecc4545a	Fix tryFindPointerOrigin assert Fix assert condition to better match the message and intention behind it.	2025-07-22 12:46:52 +02:00
Filipkowski, Lukasz	b63540c8a8	Update discard mask pattern match fix Update discard mask pattern match fix. Corrected predicate mode is set for discard branch.	2025-07-22 12:43:45 +02:00
sys_igc	14aa10abcc	[Autobackout][FunctionalRegression]Revert of change: `80068a026f`: Update discard mask pattern match fix Update discard mask pattern match fix. Corrected predicate mode is set for discard branch.	2025-07-22 04:00:09 +02:00
Cheng, Bu Qi	1a88d6ea34	Adjust pre-RA scheduling heuristic Adjust pre-RA scheduling heuristic	2025-07-21 23:13:03 +02:00
Gu, Junjie	b4641ae2ef	Fix and improve inline asm If inline asm's operands are aliases, the current code generate a copy if the operand is input; and does not handle aliased output operand. When using copy, it is a little tricky whether to use NoMask or not, especially for output operands. In addition, using inline asm is most likely for performance and additional copies should be avoided as much as possible. This change fixes output alias operands and also removes copies by generating visa alias decl with non-zero offset.	2025-07-21 21:42:38 +02:00
Ratajewski, Andrzej	a1f7a26a59	Fix subroutine handling for `intel_reqd_sub_group_size(32)` Previously, using `intel_reqd_sub_group_size(32)` on DG2 resulted in two redundant SIMD32 call instructions being generated in vISA, which could lead to unexpected issues. This change ensures that only a single SIMD32 call instruction is generated. All function arguments and return values are now correctly passed using two SIMD16 instructions, eliminating redundancy and improving	2025-07-21 13:58:39 +02:00
Alexander Heinecke	ce775e8c5e	Create fix_missing_cstdint_dev_gcc15.patch ---------------------------	2025-07-21 12:28:48 +02:00
Filipkowski, Lukasz	80068a026f	Update discard mask pattern match fix Update discard mask pattern match fix. Corrected predicate mode is set for discard branch.	2025-07-21 09:58:25 +02:00
Ashar, Pratik J	d56e799c0a	Reduce number of available colors in RA by # of reserved GRFs in fail safe In fail safe RA, we reserve some number of GRFs to guarantee RA termination. When GRFs are reserved, we must also reduce number of available colors when determining color ordering.	2025-07-21 07:47:09 +02:00
Paige, Alexander	420b632df9	Update IGC code format Update IGC code format v2.16.0	2025-07-20 06:20:11 +02:00
sys_igc	3976c0b30f	[Autobackout][FunctionalRegression]Revert of change: `c3e6c9d734`: Don't cache volatile load store instructions On platforms with default cache policy set to L1 and L3 cached such as DG2 or BMG volatile instructions are also cached. Since CUDA doesn't cache volatile pointers, there is a code that is not supported by Intel GPU, as caching volatile can lead to hangs.	2025-07-20 03:56:01 +02:00
Cheng, Bu Qi	b23eb1ef49	gather send update Gather send update	2025-07-19 10:40:25 +02:00
Pillow, Scott	d7e78d5e45	Update clang-format column limit Update clang-format column limit	2025-07-18 23:45:33 +02:00
Diana Chen	a794173711	IGA: minor indent fix For internal feature	2025-07-18 22:05:08 +02:00
Rybalov, Viacheslav G	323eca95f9	Support NaN in Bfloat MinMax resolution Fixing BfloatFuncsResolution pass to support NaN in MinMax resolution.	2025-07-18 14:51:18 +02:00
Skobejko, Milosz	c3e6c9d734	Don't cache volatile load store instructions On platforms with default cache policy set to L1 and L3 cached such as DG2 or BMG volatile instructions are also cached. Since CUDA doesn't cache volatile pointers, there is a code that is not supported by Intel GPU, as caching volatile can lead to hangs.	2025-07-18 14:43:40 +02:00
Sukhov, Egor	b98a2ba086	Only Rematerialize ptr bitcasts for function calls inside CloneAddressArithmetic Only Rematerialize ptr bitcasts for function calls inside CloneAddressArithmetic	2025-07-18 14:05:49 +02:00
Garbowski, Mateusz	92d6114445	Opaque pointer fixes in LegalizeFunctionSignatures, PromoteBools In LegalizeFunctionSignatures don't call `getFunction()` which returns parent function. Add support for llvm15+ which works with opaque pointers and a legacy llvm 14 path. In PromoteBools: - Call `getType()` on load instruction - calling `getType()` on src returns an opaque pointer. - Use getValueType() in promoteGlobalVariable to work with opaque pointers.	2025-07-18 12:51:53 +02:00
sys_igc	acae2f8d37	[Autobackout][FunctionalRegression]Revert of change: `df4a2a246d`: Fix subroutine handling for `intel_reqd_sub_group_size(32)` Previously, using `intel_reqd_sub_group_size(32)` on DG2 resulted in two redundant SIMD32 call instructions being generated in vISA, which could lead to unexpected issues. This change ensures that only a single SIMD32 call instruction is generated. All function arguments and return values are now correctly passed using two SIMD16 instructions, eliminating redundancy and improving	2025-07-18 11:47:49 +02:00
Michal Paszkowski	7fd0952833	Fix kernel_arg_base_type for OpenCL type arguments The metadata node !kernel_arg_base_type must mirror !kernel_arg_type for OpenCL builtin types (e.g. image1d_t). Unfortunately, this is inconsistent with LLVM 16-based Common Clang. This patch ensures that every OpenCL builtin type (*_t) listed in !kernel_arg_type is also present in !kernel_arg_base_type at the same position.	2025-07-18 05:45:28 +02:00
Chen, Kai	3c5f2266ea	Re-enable dst for lifetime_start in resource loop header Re-enable dst for lifetime_start in resource loop header. Set the condition to benefit compatible workloads.	2025-07-18 05:21:44 +02:00
Cheng, Bu Qi	82a1986c0a	RA change _OS_DESCRIPTION	2025-07-18 02:38:09 +02:00
Diana Chen	d2d30ed8fc	IGA: Update GEDLibrary version For internal feature	2025-07-18 02:05:25 +02:00
sys_igc	e2e261bd7e	[Autobackout][FunctionalRegression]Revert of change: `890b8bf021`: Bitcast in StatelessToStateful pass Add support for bitcast instruction into StatelessToStateful pass.	2025-07-18 00:23:49 +02:00
Michal Paszkowski	bb1ad498e6	Retype TargetExtTy return types of function declarations Clang 16 still lowers OpenCL/SPIR-V built-ins as ptr to opaque structs, while SPIR-V Reader uses TargetExtTy values. This patch extends the retyping function to also retype the return types of function (builtins) declarations. Please note that the builtin function resolution is already done earlier by SPIR-V Reader. This patch also changes how ImageFuncsAnalysis pass recognizes image/sampler types. Now, instead of relying on pointer element types, the pass uses IGC metadata (m_OpenCLArgBaseTypes) -- consistent with other passes later on in the pipeline.	2025-07-17 20:38:16 +02:00
Ratajewski, Andrzej	df4a2a246d	Fix subroutine handling for `intel_reqd_sub_group_size(32)` Previously, using `intel_reqd_sub_group_size(32)` on DG2 resulted in two redundant SIMD32 call instructions being generated in vISA, which could lead to unexpected issues. This change ensures that only a single SIMD32 call instruction is generated. All function arguments and return values are now correctly passed using two SIMD16 instructions, eliminating redundancy and improving	2025-07-17 10:42:38 +02:00
sys_igc	95429e8897	[Autobackout][FunctionalRegression]Revert of change: `cee4ac4e9e`: Retype TargetExtTy return types of function declarations Clang 16 still lowers OpenCL/SPIR-V built-ins as ptr to opaque structs, while SPIR-V Reader uses TargetExtTy values. This patch extends the retyping function to also retype the return types of function (builtins) declarations. Please note that the builtin function resolution is already done earlier by SPIR-V Reader. This patch also changes how ImageFuncsAnalysis pass recognizes image/sampler types. Now, instead of relying on pointer element types, the pass uses IGC metadata (m_OpenCLArgType) -- consistent with other passes later on in the pipeline.	2025-07-17 07:27:44 +02:00
Gu, Junjie	2284c57ba1	Apply clang-format Apply clang-format, no functional change	2025-07-17 02:50:09 +02:00
sys_igc	0b39832e8e	[Autobackout][FunctionalRegression]Revert of change: `0e204e2473`: Enable SIMD16 drop for more platforms Enable abort on spills to SIMD16 for more platforms.	2025-07-17 01:56:16 +02:00

18372 Commits