In Google File System, why are hot spots not an issue when files are read sequentially?

terabaaphoonmein · Jan 8, 2022

Hotspot: a region of a computer program where a high proportion of the executed instructions occur.

Lazy space allocation: https://stackoverflow.com/questions/18109582/what-is-lazy-space-allocation-in-google-file-system

With lazy space allocation, the physical allocation of space is delayed as long as possible, until data amounting to a full chunk (64 MB in GFS, according to the 2003 paper) has accumulated.
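A toy sketch of the idea, not GFS code: `LazyChunk` and its methods are hypothetical names made up for illustration. It assumes a chunk has a fixed logical size of 64 MB but only occupies as much storage as has actually been written, which matches the paper's description of a chunk replica as a plain file that is extended only as needed.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB logical chunk size, as in the 2003 GFS paper


class LazyChunk:
    """Hypothetical toy chunk whose backing storage grows only as data arrives."""

    def __init__(self) -> None:
        self.data = bytearray()  # nothing is reserved up front

    def append(self, payload: bytes) -> None:
        if len(self.data) + len(payload) > CHUNK_SIZE:
            raise ValueError("payload does not fit in this chunk")
        self.data.extend(payload)  # storage grows with the bytes actually written

    @property
    def allocated_bytes(self) -> int:
        return len(self.data)


chunk = LazyChunk()
chunk.append(b"x" * 1024)  # a 1 KB file stored in a single chunk
print("bytes actually used:", chunk.allocated_bytes)
print("bytes an eager 64 MB allocation would leave unused:",
      CHUNK_SIZE - chunk.allocated_bytes)
```

The second number is the space that would be wasted if the full chunk were reserved in advance for a small file.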
Large chunk size in GFS:
=> A large chunk size, even with lazy space allocation, has its disadvantages.
=> A small file consists of a small number of chunks, perhaps just one.
=> The chunkservers storing those chunks may become hot spots if many clients are accessing the same file.
=> In practice, hot spots have not been a major issue because our applications mostly read large multi-chunk files sequentially.
I don't understand how hot spots are not an issue when we read large multi-chunk files sequentially. They say hot spots are an issue if clients are accessing the same small file (a file of just one chunk).

Here is a scenario where a small file (a small number of chunks) is being accessed by multiple clients.

http://imgur.com/a/B2F4VLh

It makes sense why the chunkservers become hot spots in this case: they are kept busy because multiple clients are accessing them.
But it doesn't make sense to me when the research paper says "In practice, hot spots have not been a major issue because our applications mostly read large multi-chunk files sequentially." What's the difference? If I imagine a scenario like the one above, except that the file is made up of multiple chunks and everything else is the same, what changes?

djsfantasi · Jan 8, 2022

As far as understanding goes, a paragraph usually covers one aspect of the material under consideration.

In your example, first you have to identify the aspect (subject) being described.

Then, each sentence provides detail. And it expects the reader to mentally expand on the detail.

What are the properties of a large chunked file? For one, there are many chunks. And what happens when there are many readers? Since there are more chunks, the probability of many readers hitting the same chunk decreases. Assume 10 simultaneous reads. A small file of one chunk will be hit by all 10 readers at once. But a large file of 1,000 chunks will have each chunk hit by 0.01 readers on average, assuming the reads are spread evenly. A difference of a factor of 1,000!

The point here is that understanding a paragraph involves a mental extension.
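The averaging argument above can be checked with a couple of lines. This is only a sketch: the function name is made up for illustration, and it assumes reads are spread evenly over all chunks of the file.

```python
def average_readers_per_chunk(readers: int, chunks: int) -> float:
    """Average number of simultaneous readers per chunk, assuming reads
    are spread evenly across all chunks of the file."""
    return readers / chunks


small_file = average_readers_per_chunk(readers=10, chunks=1)
large_file = average_readers_per_chunk(readers=10, chunks=1000)

print("one-chunk file:", small_file, "readers per chunk")
print("1,000-chunk file:", large_file, "readers per chunk")
print("ratio:", round(small_file / large_file))
```

The ratio is simply the number of chunks, so the more chunks a file has, the thinner the same set of readers is spread.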

terabaaphoonmein · Jan 9, 2022

Here is a figure illustrating what you explained.

http://imgur.com/a/cj44pTa

I still don't understand why and how this holds:
"In practice, hot spots have not been a major issue because our applications mostly read large multi-chunk files sequentially." Here the same chunk can still be accessed by multiple applications.
Oh, I see. There is just less chance of a big hot spot. Hot spots can still occur, but they are smaller than when the file has fewer chunks.
Also, do you assume that the single chunk of the small file is replicated or not?
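On the replication question: the GFS paper states that chunks are replicated, with three replicas by default, so even a one-chunk file can be served by more than one chunkserver. The sketch below is a toy model, not GFS behaviour. The placement rule, the random choice of replica, and the function name are all assumptions made for illustration, and picking a uniformly random chunk stands in for readers being at different points of their sequential scans.

```python
import random
from collections import Counter


def busiest_server_load(readers, chunks, replicas, servers, rng):
    """Load on the busiest chunkserver at one instant.

    Assumed toy model: replica k of chunk c lives on server (c + k) % servers,
    each reader is at a uniformly random chunk of the file, and each reader
    picks one of that chunk's replicas at random.
    """
    load = Counter()
    for _ in range(readers):
        chunk = rng.randrange(chunks)
        replica = rng.randrange(replicas)
        load[(chunk + replica) % servers] += 1
    return max(load.values())


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
one_chunk_no_replication = busiest_server_load(10, 1, 1, 50, rng)
one_chunk_three_replicas = busiest_server_load(10, 1, 3, 50, rng)
many_chunks = busiest_server_load(10, 1000, 3, 50, rng)

print("1 chunk, 1 replica:     ", one_chunk_no_replication)
print("1 chunk, 3 replicas:    ", one_chunk_three_replicas)
print("1,000 chunks, 3 replicas:", many_chunks)
```

With one chunk and no replication, every reader lands on the same server. Replication splits that load across at most three servers, while a multi-chunk file spreads it across many more.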

terabaaphoonmein · Jan 9, 2022

djsfantasi said:
Since there are more chunks, the probability of many readers hitting the same chunk decreases. Assume 10 simultaneous reads. A small file of one chunk will be hit by all 10 readers at once. But a large file of 1,000 chunks will have each chunk hit by 0.01 readers on average, assuming the reads are spread evenly. A difference of a factor of 1,000!

Can you explain this?

djsfantasi · Jan 9, 2022

terabaaphoonmein said:
Can you explain this?

Try to think of a way to explain this yourself. Maybe an analogy. Try before reading further.
.
.
.
.
.
.
.
.
.
.
Imagine you have a large barrel (file). In it, there is one tennis ball (chunk). Reach in blindfolded and grab the tennis ball (read the file). OK. Now put the ball back and get nine friends to join you. Then have everyone grab for the ball. There WILL be contention (a hot spot). Now put 100 tennis balls into the barrel, and you and your friends each try to grab a ball. Most of the time, everyone will get a ball. Occasionally there will be contention (a hot spot), but it will be far less frequent.
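The barrel analogy can be put into numbers with a birthday-problem style calculation. This is a sketch under a stated assumption: each person independently reaches for a ball chosen uniformly at random, and "contention" means at least two people reach for the same ball. The function name is made up for illustration.

```python
from math import prod


def collision_probability(people: int, balls: int) -> float:
    """Probability that at least two people reach for the same ball,
    assuming independent, uniformly random choices."""
    if people > balls:
        return 1.0  # more people than balls: someone must share
    no_collision = prod((balls - i) / balls for i in range(people))
    return 1.0 - no_collision


print("10 people, 1 ball:   ", collision_probability(10, 1))
print("10 people, 100 balls:", round(collision_probability(10, 100), 3))
```

Under this model, ten people and one ball always collide, while ten people and 100 balls collide in roughly 37% of attempts, and when they do, usually only two people are involved rather than all ten.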

terabaaphoonmein · Jan 9, 2022

Makes sense. So we aren't considering that the files are replicated, I see.
This question has been asked for 10 marks. What should I write? The answer is simple.
The demerits are:
internal fragmentation and hot spot formation. Even in the research paper this is covered only briefly.
