Tuesday, May 13, 2025
ZKE News

NVIDIA Launches GenAI-Perf for Optimizing Generative AI Model Performance

by NZU
August 2, 2024
in Blockchain

Timothy Morano
Aug 02, 2024 02:46

NVIDIA introduces GenAI-Perf, a new tool for benchmarking generative AI models, enhancing performance measurement and optimization.





NVIDIA has unveiled GenAI-Perf, a tool for measuring and optimizing the performance of generative AI models. According to the NVIDIA Technical Blog, the tool is included in the latest release of NVIDIA Triton and is designed to help machine learning engineers find the optimal balance between latency and throughput, which is especially important for large language models (LLMs).

Key Metrics for LLM Performance

When dealing with LLMs, performance metrics extend beyond traditional latency and throughput. Key metrics include:

  • Time to first token: The time between when a request is sent and the receipt of the first response.
  • Output token throughput: The number of output tokens generated per second.
  • Inter-token latency: The time between intermediate responses for a single request, divided by the number of tokens generated in the later response.

These metrics are essential for applications where quick and consistent performance is paramount, with time to first token often being the highest priority.
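To make the three metrics concrete, here is a minimal Python sketch that computes them from recorded timestamps. It is an illustration, not GenAI-Perf's own code: the `RequestTrace` type and the function names are hypothetical, the timestamps are made up, and it assumes each streamed response carries exactly one token, so inter-token latency reduces to the mean gap between consecutive tokens.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Timestamps (in seconds) recorded for one streamed request (hypothetical type)."""
    sent_at: float
    token_times: List[float]  # arrival time of each output token, in order


def time_to_first_token(trace: RequestTrace) -> float:
    # Time between sending the request and receiving the first response.
    return trace.token_times[0] - trace.sent_at


def inter_token_latency(trace: RequestTrace) -> float:
    # Mean gap between consecutive tokens (assumes one token per response).
    n = len(trace.token_times)
    if n < 2:
        return 0.0
    return (trace.token_times[-1] - trace.token_times[0]) / (n - 1)


def output_token_throughput(traces: List[RequestTrace]) -> float:
    # Total output tokens divided by the wall-clock span of the whole run.
    total_tokens = sum(len(t.token_times) for t in traces)
    start = min(t.sent_at for t in traces)
    end = max(t.token_times[-1] for t in traces)
    return total_tokens / (end - start)


# Made-up timestamps for two overlapping requests.
traces = [
    RequestTrace(sent_at=0.0, token_times=[0.25, 0.5, 0.75, 1.0]),
    RequestTrace(sent_at=0.5, token_times=[1.0, 1.5, 2.0]),
]

for t in traces:
    print(time_to_first_token(t), inter_token_latency(t))
print(output_token_throughput(traces))
```

Note that throughput is computed over the whole run rather than per request, which is why it can rise with concurrency even while per-request latency gets worse.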

Introducing GenAI-Perf

GenAI-Perf is designed to accurately measure these specific metrics, helping users determine optimal configurations for peak performance and cost-effectiveness. The tool supports industry-standard datasets like OpenOrca and CNN_dailymail and facilitates standardized performance evaluations across various inference engines through an OpenAI-compatible API.
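One way to read "optimal configuration" is as the setting with the highest throughput that still meets a latency budget. The Python sketch below illustrates that selection step only. The `BenchmarkResult` type, the `best_under_budget` helper, and all the numbers are hypothetical placeholders, not GenAI-Perf output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BenchmarkResult:
    """One benchmark run at a given concurrency level (hypothetical type)."""
    concurrency: int
    ttft_ms: float         # time to first token, in milliseconds
    tokens_per_sec: float  # output token throughput


def best_under_budget(
    results: List[BenchmarkResult], ttft_budget_ms: float
) -> Optional[BenchmarkResult]:
    # Keep only runs that meet the latency budget, then maximize throughput.
    eligible = [r for r in results if r.ttft_ms <= ttft_budget_ms]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.tokens_per_sec)


# Illustrative, made-up measurements: throughput rises with concurrency,
# but so does time to first token.
results = [
    BenchmarkResult(concurrency=1, ttft_ms=40.0, tokens_per_sec=90.0),
    BenchmarkResult(concurrency=4, ttft_ms=75.0, tokens_per_sec=310.0),
    BenchmarkResult(concurrency=16, ttft_ms=180.0, tokens_per_sec=820.0),
    BenchmarkResult(concurrency=64, ttft_ms=650.0, tokens_per_sec=1100.0),
]

choice = best_under_budget(results, ttft_budget_ms=200.0)
print(choice)
```

With a 200 ms budget the sketch picks the run at concurrency 16: the run at 64 has higher throughput but misses the budget.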

GenAI-Perf is intended to be the default benchmarking tool for all NVIDIA generative AI offerings, including NVIDIA NIM, NVIDIA Triton Inference Server, and NVIDIA TensorRT-LLM. This facilitates easy comparisons among different serving solutions that support the OpenAI-compatible API.

Supported Endpoints and Usage

Currently, GenAI-Perf supports three OpenAI endpoint APIs: Chat, Completions, and Embeddings. Additional endpoints will be introduced as new model types emerge. GenAI-Perf is also open source and accepts community contributions.

To get started with GenAI-Perf, users can install the latest Triton Inference Server SDK container from NVIDIA GPU Cloud (NGC). The commands for running the container and server depend on the type of model being served, such as GPT2 for the chat and completions endpoints, and intfloat/e5-mistral-7b-instruct for embeddings.

Profiling and Results

For profiling OpenAI chat-compatible models, users can run specific commands to measure performance metrics such as request latency, output sequence length, and input sequence length. Sample results for GPT2 show metrics like:

  • Request latency (ms): Average of 1679.30, with a minimum of 567.31 and a maximum of 2929.26.
  • Output sequence length: Average of 453.43, ranging from 162 to 784.
  • Output token throughput (per sec): 269.99.
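Summary rows like these reduce a list of per-request measurements to a few statistics. The Python sketch below shows one way to produce such a summary. The `summarize` and `percentile` helpers are hypothetical, the nearest-rank percentile method is an assumption rather than GenAI-Perf's documented behavior, and the latencies are made up.

```python
import math
import statistics
from typing import Dict, List


def percentile(values: List[float], p: float) -> float:
    # Nearest-rank percentile: the smallest value with at least p% of the
    # samples at or below it.
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    # Collapse per-request latencies into the usual summary columns.
    return {
        "avg": statistics.fmean(latencies_ms),
        "min": min(latencies_ms),
        "max": max(latencies_ms),
        "p75": percentile(latencies_ms, 75),
        "p99": percentile(latencies_ms, 99),
    }


# Made-up request latencies in milliseconds.
summary = summarize([100.0, 200.0, 300.0, 400.0])
print(summary)
```

Averages alone can hide slow outliers, so tail percentiles are worth reading next to the mean when consistency matters.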

Similarly, for profiling OpenAI embeddings-compatible models, users can generate a JSONL file with sample texts and run GenAI-Perf to obtain metrics such as request latency and request throughput.
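A JSONL file holds one JSON object per line. The Python sketch below writes sample texts in that layout and reads them back. The `"text"` field name, the helper names, and the sample sentences are assumptions for illustration, so check the GenAI-Perf documentation for the exact input schema it expects.

```python
import json
import os
import tempfile
from typing import List


def write_jsonl(path: str, texts: List[str]) -> None:
    # One JSON object per line; the "text" key is an assumed field name.
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            f.write(json.dumps({"text": text}) + "\n")


def read_jsonl(path: str) -> List[dict]:
    # Parse each non-empty line as an independent JSON document.
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


samples = [
    "What was the first car ever driven?",
    "Who wrote the first computer program?",
]

# Write to a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "embeddings.jsonl")
    write_jsonl(path, samples)
    rows = read_jsonl(path)

print(rows)
```

The resulting file can then be supplied to the benchmarking run as its input data.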

Conclusion

GenAI-Perf provides a comprehensive solution for benchmarking generative AI models, offering insights into critical performance metrics and facilitating optimization. As an open-source tool, it allows for continuous improvement and adaptation to new model types and requirements.

