
How can DynamoDB get_item read 400KB of data in milliseconds?


I have a DynamoDB table named events where I store all user event details, such as product_view, add_to_cart, and product_purchase.

In this events table, some items have grown to 400KB.

Problem:

        response = self._table.get_item(
            Key={
                PARTITION_KEY: <pk>,
                SORT_KEY: <sk>,
            },
            ConsistentRead=False,
        )

When I fetch this (400KB) item with the DynamoDB get_item method, it takes about 5 seconds to return the result.

I have already tried DAX as well.
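
For context, the DAX read path looks roughly like this (a minimal sketch using the amazon-dax-client package; the cluster endpoint is a placeholder):

import amazondax

# Placeholder endpoint; substitute your real DAX cluster endpoint.
DAX_ENDPOINT = "daxs://my-cluster.xxxx.dax-clusters.us-east-1.amazonaws.com"

dax = amazondax.AmazonDaxClient.resource(endpoint_url=DAX_ENDPOINT)
table = dax.Table("events")

# Same get_item call as before; note that DAX only caches eventually
# consistent reads, so warm reads are served from the item cache.
response = table.get_item(
    Key={"partition_key": "user_id1111", "sort_key": "version_1"},
)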

Goal:

I want to read the 400KB item in under 1 second.

Important information:

The data is stored in DynamoDB in this format:

{
  "partition_key": "user_id1111",
  "sort_key": "version_1",
  "attributes": {
    "events": [
      {
        "t": "1614712316",
        "a": "product_view",
        "i": "1275"
      },
      {
        "t": "1614712316",
        "a": "product_add",
        ...
      },
      ...
    ]
  }
}
  • t is the timestamp
  • a can be product_view, product_add, or product_purchase
  • i is the product_id

As you can see above, events is a list, and new events keep getting appended to it.
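
A hypothetical sketch of what such an append can look like with DynamoDB's list_append update function (key values repeat the example above; the name aliases guard against reserved words; this is not necessarily how my code does it):

import boto3

table = boto3.resource("dynamodb").Table("events")

# Appends one event atomically; assumes the attributes.events path already exists.
table.update_item(
    Key={"partition_key": "user_id1111", "sort_key": "version_1"},
    UpdateExpression="SET #at.#ev = list_append(#at.#ev, :new_events)",
    ExpressionAttributeNames={"#at": "attributes", "#ev": "events"},
    ExpressionAttributeValues={
        ":new_events": [{"t": "1614712316", "a": "product_view", "i": "1275"}]
    },
)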

One of my items has reached 400KB because of the number of events in its events list.

I wrote a script to measure the time; the results are below.

import datetime

import boto3

dynamodb = boto3.resource('dynamodb')

table = dynamodb.Table('events')

pk = "user_id1111"
sk = "version_1"


t_load_start = datetime.datetime.now()


response = table.get_item(
    Key={
        "partition_key": pk,
        "sort_key": sk,
    },
    ReturnConsumedCapacity="TOTAL",
)
capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

t_load_end = datetime.datetime.now()
seconds = (t_load_end - t_load_start).total_seconds()

print(f"Elapsed time is::{seconds}sec and {capacity_units} capacity units")

This is the output I got:

Elapsed time is::5.676799sec and 50.0 capacity units

Can someone suggest a solution for this?

Solution

tl;dr: Increase your function's memory to at least 1024MB; see Update 2.


I was curious, so I took some measurements. I wrote a script that creates a big boi item of almost exactly 400KB in a new table.

Then I tested two ways of reading it from Python - once through the resource API and once through the lower-level client - using eventually consistent reads in both cases.

Here is what I measured:

Reading Big Boi from a Table resource took 0.366508s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.301585s and consumed 50.0 RCUs

If we extrapolate from the RCUs, it read an item of about 50 * 2 * 4KB = 400KB (an eventually consistent read of up to 4KB consumes 0.5 RCUs).
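
The same arithmetic in code form (one RCU buys two eventually consistent reads of up to 4KB each):

consumed_rcus = 50.0
# 2 eventually consistent 4KB reads per RCU:
estimated_item_size_kb = consumed_rcus * 2 * 4
print(estimated_item_size_kb)  # 400.0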

I ran this a few times from my machine in Germany against eu-central-1 (Frankfurt, Germany), and the highest latency I saw was about 900ms. (This is without DAX.)

Based on that, I think you should show us how you're taking your measurements.

from datetime import datetime

import boto3

TABLE_NAME = "big-boi-test"
BIG_BOI_PK = "f0ba8d6c"

TABLE_RESOURCE = boto3.resource("dynamodb").Table(TABLE_NAME)
DDB_CLIENT = boto3.client("dynamodb")

def create_table():
    DDB_CLIENT.create_table(
        AttributeDefinitions=[{"AttributeName": "PK", "AttributeType": "S"}],
        TableName=TABLE_NAME,
        KeySchema=[{"AttributeName": "PK", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST"
    )

def create_big_boi_item() -> dict:
    # Based on calculations here: https://zaccharles.github.io/dynamodb-calculator/
    template = {
        "PK": {
            "S": BIG_BOI_PK
        },
        "bigBoi": {
            "S": ""
        }
    }  # This is 16 bytes

    # Pad the item up to 400KB with a single big string attribute.
    big_boi = "X" * (1024 * 400 - 16)
    template["bigBoi"]["S"] = big_boi
    return template

def store_big_boi():
    big_boi = create_big_boi_item()

    DDB_CLIENT.put_item(
        Item=big_boi,
        TableName=TABLE_NAME
    )

def get_big_boi_with_table_resource():

    start = datetime.now()
    response = TABLE_RESOURCE.get_item(
        Key={"PK": BIG_BOI_PK},
        ReturnConsumedCapacity="TOTAL"
    )
    end = datetime.now()
    seconds = (end - start).total_seconds()
    capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

    print(f"Reading Big Boi from a Table resource took {seconds}s and consumed {capacity_units} RCUs")

def get_big_boi_with_client():

    start = datetime.now()
    response = DDB_CLIENT.get_item(
        Key={"PK": {"S": BIG_BOI_PK}},
        ReturnConsumedCapacity="TOTAL",
        TableName=TABLE_NAME
    )
    end = datetime.now()
    seconds = (end - start).total_seconds()
    capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

    print(f"Reading Big Boi from a Client took {seconds}s and consumed {capacity_units} RCUs")

if __name__ == "__main__":
    # create_table()
    # store_big_boi()
    get_big_boi_with_table_resource()
    get_big_boi_with_client()

Update

I ran the same measurements again with an item that looks more like the one you're using, and no matter how I requested it, my average measurements stayed below 1000ms:

Reading Big Boi from a Table resource took 1.492829s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.871583s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.857513s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.769432s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.690172s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.670099s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.633489s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.605999s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.598635s and consumed 50.0 RCUs
Reading Big Boi from a Table resource took 0.606553s and consumed 50.0 RCUs
Reading Big Boi from a Client took 1.66636s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.921605s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.831735s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.707082s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.668602s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.648401s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.5695s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.592073s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.611436s and consumed 50.0 RCUs
Reading Big Boi from a Client took 0.553827s and consumed 50.0 RCUs
Average latency over 10 requests with the table resource: 0.7796304s
Average latency over 10 requests with the client: 0.7770621s

Here is what the item looks like:

[Two screenshots of the nested item in the DynamoDB console, omitted; the structure matches what create_big_boi_item below produces.]

Here is the full test script for you to verify:

import statistics
from datetime import datetime

import boto3

TABLE_NAME = "big-boi-test"
BIG_BOI_PK = "nestedBoi"

TABLE_RESOURCE = boto3.resource("dynamodb").Table(TABLE_NAME)
DDB_CLIENT = boto3.client("dynamodb")

def create_table():
    DDB_CLIENT.create_table(
        AttributeDefinitions=[{"AttributeName": "PK", "AttributeType": "S"}],
        TableName=TABLE_NAME,
        KeySchema=[{"AttributeName": "PK", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST"
    )

def create_big_boi_item() -> dict:
    # Based on calculations here: https://zaccharles.github.io/dynamodb-calculator/
    template = {
        "PK": {
            "S": "nestedBoi"
        },
        "bigBoiContainer": {
            "M": {
                "bigBoiList": {
                    "L": []
                }
            }
        }
    }  # 43 bytes

    item = {
        "M": {
            "t": {
                "S": "1614712316"
            },
            "a": {
                "S": "product_view"
            },
            "i": {
                "S": "1275"
            }
        }
    }  # 36 bytes

    # Fill the nested list with as many events as fit in 400KB.
    number_of_items = int((1024 * 400 - 43) / 36)

    for _ in range(number_of_items):
        template["bigBoiContainer"]["M"]["bigBoiList"]["L"].append(item)

    return template

def store_big_boi():
    big_boi = create_big_boi_item()

    DDB_CLIENT.put_item(
        Item=big_boi,
        TableName=TABLE_NAME
    )

def get_big_boi_with_table_resource():

    start = datetime.now()
    response = TABLE_RESOURCE.get_item(
        Key={"PK": BIG_BOI_PK},
        ReturnConsumedCapacity="TOTAL"
    )
    end = datetime.now()
    seconds = (end - start).total_seconds()
    capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

    print(f"Reading Big Boi from a Table resource took {seconds}s and consumed {capacity_units} RCUs")

    return seconds

def get_big_boi_with_client():

    start = datetime.now()
    response = DDB_CLIENT.get_item(
        Key={"PK": {"S": BIG_BOI_PK}},
        ReturnConsumedCapacity="TOTAL",
        TableName=TABLE_NAME
    )
    end = datetime.now()
    seconds = (end - start).total_seconds()
    capacity_units = response["ConsumedCapacity"]["CapacityUnits"]

    print(f"Reading Big Boi from a Client took {seconds}s and consumed {capacity_units} RCUs")

    return seconds

if __name__ == "__main__":
    # create_table()
    # store_big_boi()

    n_experiments = 10
    experiments_with_table_resource = [get_big_boi_with_table_resource() for i in range(n_experiments)]
    experiments_with_client = [get_big_boi_with_client() for i in range(n_experiments)]
    print(f"Average latency over {n_experiments} requests with the table resource: {statistics.mean(experiments_with_table_resource)}s")
    print(f"Average latency over {n_experiments} requests with the client: {statistics.mean(experiments_with_client)}s")

If I increase n_experiments, the reads tend to get faster, probably because of caching inside DynamoDB.

Still: I can't reproduce your numbers.


Update 2

After learning that you're running this inside a Lambda function, I ran the tests again in Lambda with different memory configurations.

Memory  | n_experiments | Avg. time with resource | Avg. time with client
128MB   | 10            | 6.28s                   | 5.06s
256MB   | 10            | 3.26s                   | 2.61s
512MB   | 10            | 1.62s                   | 1.33s
1024MB  | 10            | 0.84s                   | 0.68s
2048MB  | 10            | 0.52s                   | 0.43s
4096MB  | 10            | 0.51s                   | 0.41s

As mentioned in the comments, CPU and network performance scale with the amount of memory you allocate to the function. You can solve your problem by throwing money at it :-)
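
For instance, bumping the memory setting programmatically might look like this (a sketch via boto3's Lambda API; the function name is hypothetical):

import boto3

lambda_client = boto3.client("lambda")

# More memory also means proportionally more CPU and network throughput.
lambda_client.update_function_configuration(
    FunctionName="my-event-reader",  # hypothetical function name
    MemorySize=1024,
)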

Another answer:

It sounds like you're running into a couple of issues. The first is that you're up against the 400KB item size limit. Although you didn't say this is a problem, it may be worth revisiting your data model so that you can store more event data; one possibility is sketched below.
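
One hypothetical remodeling shards the events across several items so no single item approaches the 400KB limit (key names follow the question; the monthly sharding scheme is purely illustrative):

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("events")

# One item per user per month instead of one ever-growing item.
table.put_item(
    Item={
        "partition_key": "user_id1111",
        "sort_key": "events#2021-03",
        "attributes": {
            "events": [{"t": "1614712316", "a": "product_view", "i": "1275"}]
        },
    }
)

# A single Query then collects all of a user's event items.
response = table.query(
    KeyConditionExpression=Key("partition_key").eq("user_id1111")
    & Key("sort_key").begins_with("events#")
)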

The performance issue is unlikely to be related to your data model. A get_item operation should have an average latency in the single-digit milliseconds, especially when you specify eventually consistent reads. Something else is going on here.

How are you testing and measuring the performance of this operation?

The AWS documentation has some advice about troubleshooting high latency DynamoDB operations that may be useful.
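
One thing that guide points at is client-side behavior such as retries and connection timeouts; here is a sketch of a client configured to surface those (the numbers are illustrative, not recommendations):

import boto3
from botocore.config import Config

# Tight timeouts and fewer retries make client-side stalls show up
# as errors instead of hiding inside a slow-looking get_item call.
config = Config(
    connect_timeout=1,
    read_timeout=2,
    retries={"max_attempts": 2},
)
table = boto3.resource("dynamodb", config=config).Table("events")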
