
Commit 1b2e803

HBASE-17554 Figure 2.0.0 Hadoop Version Support; update refguide
Minor edit on hadoop section. Mark 2.8.3 as supported and 2.8.2 as NT.
1 parent 118c1a1 commit 1b2e803

1 file changed

Lines changed: 15 additions & 11 deletions

src/main/asciidoc/_chapters/configuration.adoc
@@ -29,7 +29,7 @@
 
 This chapter expands upon the <<getting_started>> chapter to further explain configuration of Apache HBase.
 Please read this chapter carefully, especially the <<basic.prerequisites,Basic Prerequisites>>
-to ensure that your HBase testing and deployment goes smoothly, and prevent data loss.
+to ensure that your HBase testing and deployment goes smoothly.
 Familiarize yourself with <<hbase_supported_tested_definitions>> as well.
 
 == Configuration Files
@@ -164,9 +164,9 @@ It is recommended to raise the ulimit to at least 10,000, but more likely 10,240
 +
 For example, assuming that a schema had 3 ColumnFamilies per region with an average of 3 StoreFiles per ColumnFamily, and there are 100 regions per RegionServer, the JVM will open `3 * 3 * 100 = 900` file descriptors, not counting open JAR files, configuration files, and others. Opening a file does not take many resources, and the risk of allowing a user to open too many files is minimal.
 +
-Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the `ulimit -u` command. This should not be confused with the `nproc` command, which controls the number of CPUs available to a given user. Under load, a `ulimit -u` that is too low can cause OutOfMemoryError exceptions. See Jack Levin's major HDFS issues thread on the hbase-users mailing list, from 2011.
+Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the `ulimit -u` command. This should not be confused with the `nproc` command, which controls the number of CPUs available to a given user. Under load, a `ulimit -u` that is too low can cause OutOfMemoryError exceptions.
 +
-Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance. A useful read setting config on your hadoop cluster is Aaron Kimball's Configuration Parameters: What can you just ignore?
+Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance.
 +
 .`ulimit` Settings on Ubuntu
 ====
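The file-descriptor arithmetic in the diff text above can be checked directly in a shell. This is a sketch, not part of the commit; the workload figures are the illustrative ones from the text (3 ColumnFamilies, 3 StoreFiles each, 100 regions), and the two `ulimit` calls report the current soft limits for the user running the shell:

```shell
# Sketch: estimate open file descriptors per RegionServer from the example
# figures in the refguide text, then show the current limits to compare.
column_families=3
storefiles_per_family=3
regions_per_server=100
estimate=$((column_families * storefiles_per_family * regions_per_server))
echo "estimated descriptors: $estimate"   # 3 * 3 * 100 = 900
echo "current 'ulimit -n':   $(ulimit -n)"   # open-file limit
echo "current 'ulimit -u':   $(ulimit -u)"   # process limit
```

Run this as the user that actually starts HBase, since limits are per-user.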
@@ -201,7 +201,8 @@ See link:https://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Suppo
 .Hadoop 2.x is recommended.
 [TIP]
 ====
-Hadoop 2.x is faster and includes features, such as short-circuit reads, which will help improve your HBase random read profile.
+Hadoop 2.x is faster and includes features, such as short-circuit reads (see <<perf.hdfs.configs.localread>>),
+which will help improve your HBase random read profile.
 Hadoop 2.x also includes important bug fixes that will improve your overall HBase experience. HBase does not support running with
 earlier versions of Hadoop. See the table below for requirements specific to different HBase versions.
 
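For readers enabling the short-circuit reads mentioned in the tip above, the relevant HDFS client properties are `dfs.client.read.shortcircuit` and `dfs.domain.socket.path`. The fragment below is an illustration, not part of this commit; the socket path is an example value, and the refguide's performance section is the authoritative reference:

```xml
<!-- Illustrative hdfs-site.xml fragment; the socket path is an example value -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
```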
@@ -226,7 +227,8 @@ Use the following legend to interpret this table:
 |Hadoop-2.7.0 | X | X | X
 |Hadoop-2.7.1+ | S | S | S
 |Hadoop-2.8.[0-1] | X | X | X
-|Hadoop-2.8.2+ | NT | NT | NT
+|Hadoop-2.8.2 | NT | NT | NT
+|Hadoop-2.8.3+ | NT | NT | S
 |Hadoop-2.9.0 | X | X | X
 |Hadoop-3.0.0 | NT | NT | NT
 |===
@@ -252,18 +254,20 @@ data loss. This patch is present in Apache Hadoop releases 2.6.1+.
 .Hadoop 2.y.0 Releases
 [TIP]
 ====
-Starting around the time of Hadoop version 2.7.0, the Hadoop PMC got into the habit of calling out new minor releases on their major version 2 release line as not stable / production ready. As such, HBase expressly advises downstream users to avoid running on top of these releases. Note that additionally the 2.8.1 was release was given the same caveat by the Hadoop PMC. For reference, see the release announcements for link:https://s.apache.org/hadoop-2.7.0-announcement[Apache Hadoop 2.7.0], link:https://s.apache.org/hadoop-2.8.0-announcement[Apache Hadoop 2.8.0], link:https://s.apache.org/hadoop-2.8.1-announcement[Apache Hadoop 2.8.1], and link:https://s.apache.org/hadoop-2.9.0-announcement[Apache Hadoop 2.9.0].
+Starting around the time of Hadoop version 2.7.0, the Hadoop PMC got into the habit of calling out new minor releases on their major version 2 release line as not stable / production ready. As such, HBase expressly advises downstream users to avoid running on top of these releases. Note that additionally the 2.8.1 release was given the same caveat by the Hadoop PMC. For reference, see the release announcements for link:https://s.apache.org/hadoop-2.7.0-announcement[Apache Hadoop 2.7.0], link:https://s.apache.org/hadoop-2.8.0-announcement[Apache Hadoop 2.8.0], link:https://s.apache.org/hadoop-2.8.1-announcement[Apache Hadoop 2.8.1], and link:https://s.apache.org/hadoop-2.9.0-announcement[Apache Hadoop 2.9.0].
 ====
 
 .Replace the Hadoop Bundled With HBase!
 [NOTE]
 ====
-Because HBase depends on Hadoop, it bundles an instance of the Hadoop jar under its _lib_ directory.
-The bundled jar is ONLY for use in standalone mode.
+Because HBase depends on Hadoop, it bundles Hadoop jars under its _lib_ directory.
+The bundled jars are ONLY for use in standalone mode.
 In distributed mode, it is _critical_ that the version of Hadoop that is out on your cluster match what is under HBase.
-Replace the hadoop jar found in the HBase lib directory with the hadoop jar you are running on your cluster to avoid version mismatch issues.
-Make sure you replace the jar in HBase across your whole cluster.
-Hadoop version mismatch issues have various manifestations but often all look like its hung.
+Replace the hadoop jars found in the HBase lib directory with the equivalent hadoop jars from the version you are running
+on your cluster to avoid version mismatch issues.
+Make sure you replace the jars under HBase across your whole cluster.
+Hadoop version mismatch issues have various manifestations. Check for mismatch if
+HBase appears hung.
 ====
 
 [[dfs.datanode.max.transfer.threads]]
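The jar-replacement advice in the note above can be sketched as a shell session. Everything below is illustrative: a throwaway directory stands in for real installs, and the version numbers and layout are invented for the demo, not taken from the refguide:

```shell
#!/bin/sh
# Sketch only: simulate swapping HBase's bundled hadoop jars for the
# cluster's version. All paths and version numbers are illustrative.
set -e
work=$(mktemp -d)
mkdir -p "$work/hbase/lib" "$work/hadoop/share/hadoop/common"
touch "$work/hbase/lib/hadoop-common-2.7.4.jar"                    # bundled jar
touch "$work/hadoop/share/hadoop/common/hadoop-common-2.8.3.jar"   # cluster jar
rm "$work"/hbase/lib/hadoop-*.jar                # drop every bundled hadoop jar
cp "$work"/hadoop/share/hadoop/common/hadoop-*.jar "$work/hbase/lib/"
ls "$work/hbase/lib"                             # only the cluster's jar remains
```

On a real cluster the swap has to be repeated on every node, and a real Hadoop install ships several hadoop-* jars (common, hdfs, and so on), all of which need to match.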
