12.1. OpenCL Library - Intel
Intel® FPGA SDK for OpenCL™ Standard Edition: Programming Guide
ID 683342 | Date 4/22/2019 | Version 18.1 (latest) | Public
An OpenCL™ library is a single file that contains multiple functions. Each function consists of data processing logic that works at any clock frequency. You can create an OpenCL library in OpenCL or register transfer level (RTL). You can then include this library file and use the functions inside your OpenCL kernels.

Figure 15. Overview of Intel® FPGA SDK for OpenCL™'s Library Support

You may use a previously created library or create your own library. To use an OpenCL library, you do not need in-depth knowledge of hardware design or of the implementation of library components. To create an OpenCL library, you need to create the following files and components:

Table 5. Necessary Files and Components for Creating an OpenCL Library

RTL Components
- RTL source files: Verilog, SystemVerilog, or VHDL files that define the RTL component. Additional files such as Intel® Quartus® Prime IP File (.qip), Synopsys Design Constraints File (.sdc), and Tcl Script File (.tcl) are not allowed.
- eXtensible Markup Language File (.xml): Describes the properties of the RTL component. The Intel® FPGA SDK for OpenCL™ Offline Compiler uses these properties to integrate the RTL component into the OpenCL pipeline.
- Header file (.h): A C-style header file that declares the signatures of the function(s) that are implemented by the RTL component.
- OpenCL emulation model file (.cl): Provides a C model of the RTL component that is used only for emulation. Full hardware compilations use the RTL source files.

OpenCL Functions
- OpenCL source files (.cl): Contain the definitions of the OpenCL functions. These functions are used during emulation and full hardware compilations.
- Header file (.h): A C-style header file that declares the signatures of the function(s) that are defined in the OpenCL source files.

Remember: There is no difference in the header file used for RTL and OpenCL library functions. A single header file can declare both types of functions. A single library can contain both RTL and OpenCL library functions.
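To make the roles of the header file and the emulation model more concrete, here is a minimal sketch written in plain C so that it compiles with an ordinary host compiler. It is not taken from the guide: the function names lib_sat_add_u8, lib_scale, brighten, and check_models are hypothetical, and in a real library the declarations would sit in a .h file while the definitions would sit in .cl files (with the RTL function backed by Verilog or VHDL for hardware compilations).

```c
/* Sketch only: every name below is hypothetical, not part of the SDK. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shared C-style header: one header can declare both kinds of function. */
uint8_t  lib_sat_add_u8(uint8_t a, uint8_t b);    /* would be backed by RTL */
uint32_t lib_scale(uint32_t x, uint32_t factor);  /* plain helper function  */

/* Emulation model of the RTL function: a C description of what the
 * hardware is assumed to compute (a saturating 8-bit add). */
uint8_t lib_sat_add_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Helper function source: the same definition would serve emulation
 * and full hardware compilation. */
uint32_t lib_scale(uint32_t x, uint32_t factor) {
    return x * factor;
}

/* Kernel-like caller: it sees only the declarations above and does not
 * need to know which function is RTL and which is a helper. */
void brighten(const uint8_t *in, uint8_t *out, size_t n, uint8_t delta) {
    for (size_t i = 0; i < n; ++i) {
        out[i] = lib_sat_add_u8(in[i], delta);
    }
}

/* Small self-check of the two models. */
void check_models(void) {
    assert(lib_sat_add_u8(200, 100) == 255);  /* saturates at 255 */
    assert(lib_scale(6, 7) == 42);
}
```

The point of the sketch is the separation: the caller depends only on the declared signatures, which is why a single header can serve both RTL and OpenCL library functions.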
Section Content
- Understanding RTL Modules and the OpenCL Pipeline
- Packaging an OpenCL Helper Function File for an OpenCL Library
- Packaging an RTL Component for an OpenCL Library
- Verifying the RTL Modules
- Packaging Multiple Object Files into a Library File
- Specifying an OpenCL Library when Compiling an OpenCL Kernel
- Using an OpenCL Library that Works with Simple Functions (Example 1)
- Using an OpenCL Library that Works with External Memory (Example 2)
- OpenCL Library Command-Line Options

Related Information
- OpenCL Library Command-Line Options